Introduction
Working with customers can be very rewarding, but when things go wrong - particularly
general protection faults (GPFs) in software running on production systems -
everyone starts singing the blues. Customers panic and engineers
scramble for information about what went wrong; everyone is unhappy.
This is not a good way to get things done: an ounce of prevention is worth a pound of
cure. The best way to protect against the GPF blues is to be prepared. If
your system goes down, it should be able to gather all the information needed
to assess the failure, quickly, with little or no customer assistance.
How can one realize this goal? Accept no imitations: thorough testing throughout the
development lifecycle of the application is fundamental (good design helps
too). Unfortunately, in large, complex systems, testing is not a
perfect guarantee and a safety net is required (here's a nickel to all the
smirking EEs in the crowd). Exception filters provide excellent protection
against unexpected emergencies.
Overview
Exception filters enable an application "to supersede the top-level exception handler of
each thread and process." What does that mean? When something goes terribly
wrong, the top-level exception handler is typically the first and last
to know. An exception filter lets the application put its own code in that position.
Flexibility:
Properly implemented, the handler that deals with an exception is independent of the
plumbing that delivers the exception. This design criterion promotes reuse of handlers
between projects. A good example is the exception reporter and mailer. The
filter pipes the exception to the handlers, which in turn write a report out, mail
it, then shut down:
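int main(int, char**) {
    // The most recently installed filter runs first: report, mailman, then shutdown.
    exception::filter<exception::shutdown>::install();
    exception::filter<exception::mailman>::install();
    exception::filter<exception::report>::install();
    // ... application code ...
}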
Granularity:
Several people work on a large system; each part is developed independently, but all will be
integrated in the end. Each part requires its own specialized exception handling (GPF
teardown processing). How can we meet these requirements? A good design
allows the code to be partitioned by subsystem, and allows the end user to
control the order in which the handlers are called:
exception::filter<controller::gpf_handler>::install();
exception::filter<logging::gpf_handler>::install();
exception::filter<loader::gpf_handler>::install();
exception::filter<transport::gpf_handler>::install();
As subsystems are added, specific handlers may be added to accommodate their
special requirements.
Design
The exception::filter class (a singleton) provides a public interface to install,
uninstall and configure the exception filter; privately, exception::filter
interfaces with the operating system.
A template parameter serves as the action executed when an unhandled exception occurs.
Each newly installed filter chains back to the previously installed filter.
Figure 1. Filter collaboration diagram
Implementation
::SetUnhandledExceptionFilter does all the work.
It accepts a callback as an input parameter and returns the previous filter.
The previous filter is saved, providing a simple mechanism to
chain filters. The callback accepts a PEXCEPTION_POINTERS
as an input parameter, which provides context about what went wrong.
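The chaining described above can be sketched with a small portable model. Everything in it is an assumption made for illustration: `set_top_level_filter` is a stand-in that only mimics the "return the previous filter" contract of ::SetUnhandledExceptionFilter, the handlers take a plain `const char*` instead of a PEXCEPTION_POINTERS, and the toy `report` and `shutdown` handlers merely record that they ran. It is not the article's actual implementation.

```cpp
#include <string>

namespace chain_model {

// Hypothetical stand-in for the OS: one process-wide slot holding the
// top-level filter. Like ::SetUnhandledExceptionFilter, installing a new
// filter returns the one it replaced.
typedef void (*filter_fn)(const char* context);

inline filter_fn& top_level_slot() {
    static filter_fn slot = nullptr;
    return slot;
}

inline filter_fn set_top_level_filter(filter_fn next) {
    filter_fn previous = top_level_slot();
    top_level_slot() = next;
    return previous;
}

// Sketch of the filter template: Handler supplies the action, the filter
// supplies the plumbing that chains back to the previously installed filter.
template <typename Handler>
class filter {
public:
    static void install() { previous() = set_top_level_filter(&callback); }

private:
    static filter_fn& previous() {
        static filter_fn saved = nullptr;
        return saved;
    }
    static void callback(const char* context) {
        Handler::handler(context);             // run this filter's action
        if (previous()) previous()(context);   // then pass the fault down the chain
    }
};

// Simulates the OS delivering an unhandled exception to the top-level filter.
inline void raise_fault(const char* context) {
    if (top_level_slot()) top_level_slot()(context);
}

// Two toy handlers that record the order in which they run.
inline std::string& trace() { static std::string t; return t; }
struct report   { static void handler(const char*) { trace() += "report;"; } };
struct shutdown { static void handler(const char*) { trace() += "shutdown;"; } };

} // namespace chain_model
```

Installing `filter<shutdown>` first and `filter<report>` second, then calling `raise_fault`, runs `report` before `shutdown`: the filter installed last is the first to see the fault.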
Note that the chaining mechanism disallows uninstallation of filters in any order save
the reverse order of the installation. Thus:
exception::filter<exception::shutdown>::install();
exception::filter<exception::report>::install();
exception::filter<exception::report>::uninstall();
exception::filter<exception::shutdown>::uninstall();
is well-defined whereas:
exception::filter<exception::shutdown>::install();
exception::filter<exception::report>::install();
exception::filter<exception::shutdown>::uninstall();
results in undefined behavior.
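One way to see why the order matters is to model the bookkeeping. The sketch below assumes that uninstall simply hands the filter saved at install time back to the operating system; the article does not show that code, so treat this as a guess at the mechanism. The names `set_top`, `filter_record` and the two trace functions are hypothetical, and filters are represented by their names rather than by function pointers.

```cpp
#include <string>

namespace order_model {

// Hypothetical stand-in for the OS slot: holds the name of the current
// top-level filter ("" means none). Setting it returns the previous value,
// as ::SetUnhandledExceptionFilter does.
inline std::string& top() { static std::string name; return name; }

inline std::string set_top(const std::string& next) {
    std::string previous = top();
    top() = next;
    return previous;
}

// Assumed bookkeeping: each filter remembers only what it replaced.
struct filter_record {
    std::string name;
    std::string previous;
    void install()   { previous = set_top(name); }
    void uninstall() { set_top(previous); }   // blindly restores the saved filter
};

// Reverse order of installation: the slot ends up empty, as expected.
inline std::string uninstall_in_reverse_order() {
    top().clear();
    filter_record shutdown{"shutdown", ""}, report{"report", ""};
    shutdown.install();
    report.install();
    report.uninstall();     // slot: "shutdown"
    shutdown.uninstall();   // slot: ""
    return top();
}

// Same order as installation: the bookkeeping no longer matches reality.
inline std::string uninstall_out_of_order() {
    top().clear();
    filter_record shutdown{"shutdown", ""}, report{"report", ""};
    shutdown.install();
    report.install();
    shutdown.uninstall();   // slot: "" (report is gone although never uninstalled)
    report.uninstall();     // slot: "shutdown" (an uninstalled filter is active again)
    return top();
}

} // namespace order_model
```

In this model the out-of-order sequence first drops `report` without being asked to, then brings `shutdown` back after it was uninstalled. With real function pointers the restored filter may no longer be valid at all.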
A simple handler: exception::shutdown
Our first handler's job is a simple but necessary one: terminate the
program that has faulted as quickly as possible. Thus, our handler becomes:
struct shutdown {
    // Terminate the faulted process immediately.
    static void handler(const char*, PEXCEPTION_POINTERS) { ::exit(-1); }
};
We rely on other handlers to report and notify us of the fault before
this handler, installed first and therefore run last, tears the process down.
The report handler: exception::report
Shutting down is not enough to determine what went wrong; context is required
to correctly diagnose a problem in the field. The best context often is
a process/system snapshot at the time of failure: register information,
call stack, loaded modules, processes, running threads.
Matt Pietrek implemented an exception reporter in the April 1997 edition of
Microsoft Systems Journal using the Windows imagehlp library. I have adapted
this code for use here.
Using the code
To define a handler, implement a struct or class that defines a static function
'handler' with the following signature:
struct report
{
    static void handler(const char* sz_log, PEXCEPTION_POINTERS p_info);
};
The first parameter denotes where any log information should go, the second
provides necessary context.
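As a minimal sketch of a user-defined handler with that signature, the hypothetical `fault_logger` below appends one line containing the exception code to a log file. It is not part of the article's library, and it assumes that `sz_log` is a file path. So that the sketch builds off Windows as well, it declares cut-down stand-ins for the two structures it touches; on Windows the real definitions from windows.h are used.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
// Minimal stand-ins so the sketch builds off Windows; on Windows the real
// definitions come from <windows.h>.
struct EXCEPTION_RECORD   { unsigned long ExceptionCode; void* ExceptionAddress; };
struct EXCEPTION_POINTERS { EXCEPTION_RECORD* ExceptionRecord; };
typedef EXCEPTION_POINTERS* PEXCEPTION_POINTERS;
#endif

// Hypothetical handler, not part of the article's library.
struct fault_logger {
    // Formats the exception code carried by p_info.
    static std::string describe(PEXCEPTION_POINTERS p_info) {
        char line[64];
        std::snprintf(line, sizeof line, "unhandled exception 0x%08lX",
                      static_cast<unsigned long>(
                          p_info->ExceptionRecord->ExceptionCode));
        return line;
    }

    // Required signature: appends one line to the file named by sz_log
    // (assumption: sz_log is a file path).
    static void handler(const char* sz_log, PEXCEPTION_POINTERS p_info) {
        if (std::FILE* log = std::fopen(sz_log, "a")) {
            std::fprintf(log, "%s\n", describe(p_info).c_str());
            std::fclose(log);
        }
    }
};
```

Following the pattern used elsewhere in this article, such a handler would presumably be installed with `plugware::exception::filter<fault_logger>::install();`. For an access violation, the logged code would be 0xC0000005.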
To install an exception filter, use exception::filter::install.
plugware::exception::filter<plugware::exception::shutdown>::install();
plugware::exception::filter<plugware::exception::report>::install();
plugware::exception::filter<gpf_handler_1>::install();
plugware::exception::filter<gpf_handler_2>::install();
To uninstall a filter, use exception::filter::uninstall. You must uninstall the
filters in the reverse of the order in which you installed them. Failure to do so
results in undefined behavior.
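plugware::exception::filter<gpf_handler_2>::uninstall();
plugware::exception::filter<gpf_handler_1>::uninstall();
plugware::exception::filter<plugware::exception::report>::uninstall();
plugware::exception::filter<plugware::exception::shutdown>::uninstall();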
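Since C++ destroys local objects in the reverse order of their construction, one possible way to keep installs and uninstalls paired correctly is a small RAII guard. This is a sketch of that idea rather than something the library provides: `scoped_filter` is a hypothetical name, and the two toy filters below only record calls so the ordering can be observed without touching the operating system.

```cpp
#include <string>

// Hypothetical RAII guard: installs Filter on construction and uninstalls it
// on destruction. Filter is any type exposing static install()/uninstall().
template <typename Filter>
class scoped_filter {
public:
    scoped_filter()  { Filter::install(); }
    ~scoped_filter() { Filter::uninstall(); }
    scoped_filter(const scoped_filter&) = delete;
    scoped_filter& operator=(const scoped_filter&) = delete;
};

// Toy filters that only record calls, standing in for exception::filter<...>.
inline std::string& calls() { static std::string log; return log; }

struct toy_shutdown_filter {
    static void install()   { calls() += "+shutdown "; }
    static void uninstall() { calls() += "-shutdown "; }
};

struct toy_report_filter {
    static void install()   { calls() += "+report "; }
    static void uninstall() { calls() += "-report "; }
};

inline std::string run_scoped_demo() {
    calls().clear();
    {
        scoped_filter<toy_shutdown_filter> shutdown_guard;  // installed first
        scoped_filter<toy_report_filter>   report_guard;    // installed second
        // ... application code would run here ...
    }   // guards are destroyed in reverse order: report, then shutdown
    return calls();
}
```

With the real library the guards would presumably wrap types such as `exception::filter<exception::shutdown>`, declared in main in the desired installation order.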
About the demo program
The demo project installs two user-defined handlers (gpf_handler_1,
gpf_handler_2), and two built-in handlers (shutdown, report). An intentional
fault is generated when a null pointer is dereferenced. The handlers will
not run under a debugger - while a debugger is attached, the unhandled exception
is passed to the debugger instead of the installed filters.
To run the program, compile it and run it from the command line.
Conclusion
As software grows more complex, a vendor's ability to make strong guarantees about
the lifetime and behavior of an application diminishes. While thorough and
adequate testing throughout the lifecycle of a product is important, if
something can go wrong, it will.
While not a panacea, exception filters provide ample opportunity to gather
context to ensure that unforeseen problems are quickly diagnosed and treated
in the field. Happy Coding!
History
- 30/04/2004 UML collaboration diagram added
- 30/04/2004 History section added
- 27/04/2004 Article creation