Don't be caught looking like a fool when your website fails unexpectedly. Find errors ahead of time using automated testing techniques. In this article, we discuss automated regression testing. Use PHP to set up your own continuous integration server.
When you are responsible for updating a website that is in active use, you want to feel confident that the changes you make to the code do not break any currently working pages. Nothing is worse than making a small change to fix one problem and finding out later, through a customer, that it had the unintended consequence of breaking another page. Spare yourself that embarrassment and set up automated webpage checking. It's not too difficult, and you will thank yourself later when you identify errors before pushing changes made on your development server to the production server.
Types of Testing
- Unit Testing - A tool like PHPUnit tests the individual components of your programs and is run each time code changes are made.
- Regression Testing - The subject of this article. These tests help verify that the webpage works as a whole. They are designed to supplement unit testing and provide early notification when any page or web service breaks. Scheduling these tests to run repeatedly is the next best thing to having your own continuous integration service, but is much easier to set up.
- Integration/UI Testing - A UI simulator like Selenium tests the end user's experience on the website. While useful, this is time-consuming to set up and is easily defeated when the site layout changes. This testing is beyond the scope of this article.
First Things First, the Overall Process
- Separate development servers from production. It is recommended to have a separate development server and production server. Development servers can be low power and low memory machines which are not only inexpensive to maintain, but have the added benefit of exposing any performance issues early.
- Perform unit testing on the development server each time you make code changes.
- Perform automatic regression tests from a different computer than the one you are checking. If the computer you are testing goes down, you want to be notified; if the tests run on that same computer and it fails, you risk never being notified of a serious problem. Run an automated task on the production server that checks the development server.
- Test in both directions. Run an automated task on the development server that checks the production server.
- Confidently push code changes to the production server after testing successfully passes.
Writing a Regression Test - Get What You Expect, Not What You Don't
Pick the first page you want to check and identify three things: the URL of the page, content that is expected to be on the page (e.g., the page title), and content that should never be on the page (error messages, etc.). The test verifies that the content you expect is present and that the content you don't want is absent. This could be any page you want; I am using checkliststogo.com's homepage (a site I develop and maintain). The error messages I use are geared towards a Linux, PHP, and Apache server, but the same basics apply to other systems. Make sure error reporting is on in your programming language (for PHP, this can be done by setting error_reporting = E_ALL in /etc/php.ini; display_errors must also be On for the messages to appear in the page output).
URL - "www.checkliststogo.com"
Expected content - "Checklists ToGo", "Popular Checklists", "WorxForUs ©"
Error content - "Parse error: syntax error", "{local filesystem root path to your site}", in my case: "/home/ec2-user/www/htdocs/"
Using the local filesystem page path in the check for error content combined with the web server language displaying errors is an extremely powerful and easy way to check for site errors. Syntax errors, database errors, run-time errors, and all kinds of problems are all easily detected since errors report a filesystem trace including the root path when errors occur. NOTE: The relative path below the filesystem root should not be used in the error detection since those strings will be found in page links.
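To make the note about the root path concrete, here is a minimal sketch. The two page bodies are made up for illustration, and find_error_indicators is a hypothetical name that is not part of the helper shown later. The first two lines are the runtime equivalent of the php.ini settings and are assumed to be used on the development server only.

```php
<?php
// Runtime equivalent of error_reporting = E_ALL and display_errors = On
// (assumption: applied on the development server only).
error_reporting(E_ALL);
ini_set('display_errors', '1');

/**
 * Returns the indicators that occur in $page_content (case-insensitive).
 * Hypothetical helper, written only for this illustration.
 */
function find_error_indicators(string $page_content, array $indicators): array
{
    $found = array();
    foreach ($indicators as $indicator) {
        if (stripos($page_content, $indicator) !== false) {
            $found[] = $indicator;
        }
    }
    return $found;
}

// Made-up page bodies: one containing a PHP warning, one healthy page with a link.
$broken_page  = 'Warning: Undefined variable $title in /home/ec2-user/www/htdocs/index.php on line 12';
$healthy_page = '<a href="/htdocs/about.php">About</a> Checklists ToGo';

$indicators = array("Parse error: syntax error", "/home/ec2-user/www/htdocs/");

print_r(find_error_indicators($broken_page, $indicators));  // finds the root path
print_r(find_error_indicators($healthy_page, $indicators)); // finds nothing
```

Checking for the shorter string "/htdocs/" instead would also match the link in the healthy page, which is the false positive the note warns about.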
Running the Regression Test - Main Code
Now that we know the site and what strings to check for, we can build the program to run the actual test in a file called sample_index_test.php.
<?php
include_once("validate_site_helper.php");

// Page under test
$url = "http://www.checkliststogo.com";

// Strings that must never appear in the page
$err_arr = array();
$err_arr[] = "Parse error: syntax error";
$err_arr[] = "/home/ec2-user/www/htdocs/";

// Strings that must always appear in the page
$pass_arr = array();
$pass_arr[] = "Checklists ToGo";
$pass_arr[] = "Popular Checklists";
$pass_arr[] = "WorxForUs ©";

$result = validate_site_helper::check_site($url, $err_arr, $pass_arr, basename(__FILE__));
if (!$result->success) {
    $message = "Page {$url} testing failed - {$result->error}";
    handle_error_notification($message);
} else {
    echo ("Site {$url} is ok");
}

// Replace the body of this function with your own notification method.
function handle_error_notification($message) {
    echo ("ERROR: {$message}");
}
?>
The validate site helper encapsulates all this checking and returns a result object that lets you know how the testing went.
The error notification is going to be different for each system and is beyond the scope of this article. In my case, I like to use Amazon Simple Email Service to send emails to myself when errors are detected, and I find that works very well.
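For readers without Amazon SES, one possible stand-in is PHP's built-in mail() function. The sketch below is only an illustration: build_notification and handle_error_notification_by_mail are hypothetical names, and it assumes the server has a working local mail transfer agent, which mail() depends on.

```php
<?php
/**
 * Builds the subject and body of a failure email.
 * Kept separate from sending so it can be checked without delivering mail.
 */
function build_notification(string $url, string $error_text): array
{
    return array(
        'subject' => "Regression test failed: {$url}",
        'body'    => "Page {$url} testing failed\r\n\r\n{$error_text}",
    );
}

/**
 * Sends the failure email with PHP's mail().
 * Assumption: a local mail transfer agent is configured on this machine.
 */
function handle_error_notification_by_mail(string $to, string $url, string $error_text): bool
{
    $note = build_notification($url, $error_text);
    return mail($to, $note['subject'], $note['body']);
}
```

In the test script, the call to the echo-based handler could be swapped for handle_error_notification_by_mail with your own address as the first argument.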
Validate Site Helper - Code
The validate_site_helper does all the hard work of getting the URL page content, parsing the text for the expected and error strings, and then returning the result.
<?php
// Simple holder that carries the outcome of one page check.
class validation_result {
    public $success = true;
    public $error = "";
    public $subject = "";
}

class validate_site_helper {
    // Scans page content that has already been downloaded.
    protected static function check_site_helper($site_content, $host_url,
            $err_indicators_arr, $pass_indicators_arr, $calling_file) {
        $result = new validation_result();
        try {
            if ($site_content === false) {
                // file_get_contents returns false on failure instead of throwing.
                $result->success = false;
                $result->error .= "Page content could not be retrieved.\r\n";
                $site_content = "";
            }
            $ctg_content = $site_content;
            // Any error indicator found in the page is a failure.
            foreach ($err_indicators_arr as $err_str) {
                if (stripos($ctg_content, $err_str) !== false) {
                    $result->success = false;
                    $result->error .= "Suspected error indication: '$err_str' found in generated page content.\r\n";
                }
            }
            // Any expected string missing from the page is a failure.
            foreach ($pass_indicators_arr as $pass_str) {
                if (stripos($ctg_content, $pass_str) === false) {
                    $result->success = false;
                    $result->error .= "Validation indication: '$pass_str' was not found in generated page content.\r\n";
                }
            }
            if (!$result->success) {
                $result->subject = "$host_url - Warning - $calling_file";
            }
        } catch (Exception $e) {
            $result->success = false;
            $body = $e->getMessage()."\r\n".$e->getTraceAsString();
            $result->error .= $body;
            $result->subject = "$host_url - Execution Error - $calling_file";
        }
        return $result;
    }

    // Fetches the page with a GET request and checks its content.
    public static function check_site($host_url, $err_indicators_arr, $pass_indicators_arr, $calling_file) {
        $result = new validation_result();
        try {
            $ctg_content = file_get_contents($host_url);
            $result = self::check_site_helper($ctg_content,
                $host_url, $err_indicators_arr, $pass_indicators_arr, $calling_file);
        } catch (Exception $e) {
            $result->success = false;
            $body = $e->getMessage()."\r\n".$e->getTraceAsString();
            $result->error .= $body;
            $result->subject = "$host_url - Execution Error - $calling_file";
        }
        return $result;
    }

    // Fetches the page with a form-encoded POST request and checks its content.
    public static function check_site_with_post($host_url, $post_params_array,
            $err_indicators_arr, $pass_indicators_arr, $calling_file) {
        $result = new validation_result();
        try {
            $options = array(
                'http' => array(
                    'header' => "Content-type: application/x-www-form-urlencoded\r\n",
                    'method' => 'POST',
                    'content' => http_build_query($post_params_array),
                ),
            );
            $context = stream_context_create($options);
            $ctg_content = file_get_contents($host_url, false, $context);
            $result = self::check_site_helper($ctg_content,
                $host_url, $err_indicators_arr, $pass_indicators_arr, $calling_file);
        } catch (Exception $e) {
            $result->success = false;
            $body = $e->getMessage()."\r\n".$e->getTraceAsString();
            $result->error .= $body;
            $result->subject = "$host_url - Execution Error - $calling_file";
        }
        return $result;
    }
}
?>
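The POST variant of the check sends its parameters as a form-encoded request body. As a rough illustration of what is put on the wire, the sketch below builds the same kind of stream context on its own; the field names and values are invented for the example and do not come from the site.

```php
<?php
// Hypothetical form fields; a real test would use the fields of the page under test.
$post_params = array('email' => 'tester@example.com', 'list name' => 'Camping Trip');

// http_build_query URL-encodes both keys and values into the request body.
$body = http_build_query($post_params);

// Same option layout the helper passes to stream_context_create.
$options = array(
    'http' => array(
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n",
        'method'  => 'POST',
        'content' => $body,
    ),
);
$context = stream_context_create($options);

echo $body; // email=tester%40example.com&list+name=Camping+Trip
```

Passing $context as the third argument of file_get_contents is what turns the default GET request into a POST.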
Validation Results
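This works by using PHP's built-in file_get_contents function, which grabs the contents of the URL from the server. When the page cannot be retrieved (for example, the page is not found or the host cannot be resolved), file_get_contents returns false and emits a warning rather than throwing an exception, so the test still runs to completion and reports the failure. The call is also wrapped in a try block that catches any exception thrown along the way, so the script can continue and report the problem back to the user. Otherwise, if the page could not be retrieved, the notification code would not execute, which would be a big problem.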
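PHP's file_get_contents signals most failures with a warning and a false return value rather than an exception. One way to make such failures visible to a try/catch is to convert warnings into exceptions for the duration of the fetch. The sketch below illustrates that pattern and is not the helper's own code; fetch_or_throw is a hypothetical name.

```php
<?php
/**
 * Fetches a URL (or file path) and throws if it cannot be read.
 * Hypothetical wrapper around file_get_contents, for illustration only.
 */
function fetch_or_throw(string $url): string
{
    // Temporarily turn PHP warnings and notices into ErrorException.
    set_error_handler(function (int $severity, string $message, string $file, int $line) {
        throw new ErrorException($message, 0, $severity, $file, $line);
    });
    try {
        $content = file_get_contents($url);
    } finally {
        // Always put the previous error handler back.
        restore_error_handler();
    }
    if ($content === false) {
        // Fallback in case the failure produced no warning.
        throw new RuntimeException("Could not retrieve {$url}");
    }
    return $content;
}
```

A caller can then wrap fetch_or_throw in try/catch and record the exception message in the result object, in the same way the helper's catch blocks do.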
The returned validation_result object is just a holder to pass along the results of the validation. When you get the results, you'll want to pass them on somewhere to let the developer know that an error has occurred. In the sample code here, we are just outputting to the screen.
This code was tested under multiple failure scenarios, including:
- Server is offline (IP address could not be resolved)
- Page is not authorized
- Page does not exist
- Page is blank
ERROR: Page http://www.checkliststogo.com/ctg/app testing failed - Validation indication: 'Checklists ToGo' was not found in generated page content. Validation indication: 'Popular Checklists' was not found in generated page content. Validation indication: 'WorxForUs ©' was not found in generated page content.
- Page is OK
Site http://www.checkliststogo.com/ctg/app is ok.
Automating the Testing
Schedule the test script to run repeatedly with cron. You will need to have added the email notification (or another system) first, since cron runs the script in the background and you will not see its output directly.
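As one possible shape for a script that cron could run, the sketch below checks several pages in a single pass and gathers the failures so that one notification can cover them all. The crontab line, the file paths, and the collect_failures function are assumptions made for this illustration; the fetcher is passed in as a callable so the logic can be tried without touching the network.

```php
<?php
// Hypothetical crontab entry (schedule and paths are assumptions):
//   */15 * * * * /usr/bin/php /path/to/run_all_tests.php

/**
 * Checks every page in $pages and returns one message per problem found.
 * $pages maps a URL to the list of strings expected on that page.
 * $fetch is any callable that takes a URL and returns the page body,
 * or false when the page cannot be retrieved.
 */
function collect_failures(array $pages, callable $fetch): array
{
    $failures = array();
    foreach ($pages as $url => $expected_strings) {
        $body = $fetch($url);
        if ($body === false) {
            $failures[] = "{$url} could not be retrieved";
            continue;
        }
        foreach ($expected_strings as $expected) {
            if (stripos($body, $expected) === false) {
                $failures[] = "{$url} is missing '{$expected}'";
            }
        }
    }
    return $failures;
}

// Usage sketch for a real run, with file_get_contents as the fetcher:
// $pages = array("http://www.checkliststogo.com" => array("Checklists ToGo"));
// $failures = collect_failures($pages, 'file_get_contents');
// if ($failures) { handle_error_notification(implode("\r\n", $failures)); }
```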
If you find this code useful, please let me know in the comments, give a +1, or send a smiley cat picture.