Introduction
I have used the code and the mechanism described in this article for almost 20 years and, so far, I have not found a better method for error handling in large C++ projects. The original idea is taken from an article that appeared in Dr. Dobb's Journal back in 2000. I have added a few bits and pieces to make it easier to use in a production environment.
The impulse to write this article was a recent posting on Andrzej's C++ blog. As we'll see later in this article, using error code objects can produce code that is significantly cleaner and easier to maintain.
Background
Every C++ programmer learns that there are two traditional methods to deal with abnormal conditions: the first, inherited from good old C, is to return an error code and hope that the caller will test it and take appropriate action; the second is to throw an exception and hope that a surrounding block has provided a catch handler for that exception. The C++ FAQ strongly advocates the second method, arguing that it leads to safer code.
Using exceptions, however, has its own drawbacks. Code tends to become more complicated and users have to be aware of all the exceptions that can be thrown. This is why older versions of the C++ standard allowed an "exception specification" in function declarations (dynamic exception specifications were deprecated in C++11 and removed in C++17). In addition, exceptions tend to make code less efficient.
Error code objects (erc) are designed to be returned by functions like the traditional C error codes. The big difference is that, when not tested, they throw an exception.
Let us take a small example and see what the different implementations would look like. First, the "classical C" approach with traditional error codes:
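#include <cassert>
#include <cmath>
#include <cstdio>
// Returns 0 on success, -1 if the argument is negative
int my_sqrt (double& value) {
  if (value < 0)
    return -1;
  value = sqrt (value);
  return 0;
}
int main () {
  double val = -1;
  // Return value tested: the error is handled
  if (my_sqrt (val) == -1)
    printf ("square root of negative number");
  // Return value ignored: the error goes unnoticed...
  my_sqrt (val);
  assert (val >= 0);   // ...until something fails later on
}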
If the result is not checked, all kinds of bad things can happen and we have to be prepared to use all the traditional debugging tools to find out what went wrong.
Using "traditional" C++ exceptions, the same code could look like this:
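#include <cassert>
#include <cmath>
#include <cstdio>
#include <exception>
// Throws std::exception if the argument is negative
void my_sqrt (double& value) {
  if (value < 0)
    throw std::exception ();
  value = sqrt (value);
}
int main () {
  double val = -1;
  // Exception caught: the error is handled
  try {
    my_sqrt (val);
  } catch (std::exception& x) {
    printf ("square root of negative number");
  }
  // No handler: the exception escapes main and terminates the program
  my_sqrt (val);
  assert (val >= 0);
}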
This works great in a small example like this because we can see what the my_sqrt function does and pepper the code with try...catch blocks.
If, however, the function is buried deep in a library, you might not know what exceptions it might throw. Note that the signature of my_sqrt doesn't give any clue as to what, if any, exceptions it might throw.
And now... drumroll... here are the erc objects in action:
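erc my_sqrt (double& value) {
  if (value < 0)
    return -1;
  value = sqrt (value);
  return 0;
}
int main () {
  double val = -1;
  // Return value tested, like a C error code
  if (my_sqrt (val) == -1)    // (1)
    printf ("square root of negative number");
  // Return value not tested: the erc object throws
  try {
    my_sqrt (val);            // (2)
  } catch (erc& x) {
    printf ("square root of negative number");
  }
  // Not tested and no handler: the exception terminates the program
  my_sqrt (val);
  assert (val >= 0);
}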
A few observations before diving into the magic of how this works:
A First Look at Error Code Objects
For a "big picture" presentation, we are going to ignore some details but we'll get back to those in a moment.
When an erc object is created, it has a numerical value (like any C error code) and an activity flag that is initially set.
class erc
{
public:
  erc (int val) : value (val), active (true) {}
private:
  int value;
  bool active;
};
If the object is destroyed while the activity flag is set, the destructor throws an exception.
class erc
{
public:
  erc (int val) : value (val), active (true) {}
  ~erc () noexcept(false) {if (active) throw *this;}
private:
  int value;
  bool active;
};
So far, still nothing very special: this is an object that throws an exception, albeit from its destructor. Throwing from a destructor is frowned upon nowadays and, since C++11, destructors are implicitly noexcept; that is why we have to decorate the destructor declaration with noexcept(false).
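The effect of the noexcept(false) decoration can be checked at compile time. The following is a minimal sketch of the language rule only; quiet_dtor and throwing_dtor are hypothetical types made up for the illustration and are not part of the erc code.

```cpp
#include <type_traits>

// Hypothetical types, used only to illustrate the language rule.
struct quiet_dtor {
  ~quiet_dtor () {}                     // implicitly noexcept since C++11
};

struct throwing_dtor {
  ~throwing_dtor () noexcept(false) {}  // explicitly allowed to throw
};

// An exception escaping ~quiet_dtor would end in std::terminate.
static_assert (std::is_nothrow_destructible<quiet_dtor>::value,
               "destructors are noexcept by default");
static_assert (!std::is_nothrow_destructible<throwing_dtor>::value,
               "noexcept(false) lifts that default");
```

If the first assertion is what you get for your own class, an exception thrown by its destructor never reaches a catch handler.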
The integer conversion operator returns the numerical value of the erc object and resets the activity flag:
class erc
{
public:
  erc (int val) : value (val), active (true) {}
  ~erc () noexcept(false) {if (active) throw *this;}
  operator int () {active = false; return value;}
private:
  int value;
  bool active;
};
Because the activity flag has been reset, the destructor will no longer throw an exception when the object goes out of scope. Typically, the integer conversion operator is invoked when the error code is tested against a certain value.
Looking back at the simple usage example, at the comment marked (1), the erc object returned by the function my_sqrt is compared with an integer value and this invokes the integer conversion operator. As a result, the activity flag is reset and the destructor doesn't throw. At the comment marked (2), the returned erc object is destroyed after my_sqrt() returns and, because its activity flag is set, the destructor throws an exception.
Following a well-established Unix convention and because, as Aristotle said, there is only one way to succeed, the value 0 is reserved to indicate success. An erc with a value of 0 never throws an exception. Any other value indicates failure and generates an exception (if not tested).
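Putting the pieces together, here is a minimal, self-contained sketch of the simplified class with the "0 is success" rule added. It is not the class from the attached source code. Two things are assumptions of this sketch: the destructor resets the flag before throwing, so that the thrown copy does not throw again when it is destroyed, and the compiler elides the copy when my_sqrt returns (guaranteed since C++17). The helpers failed_when_tested, code_when_ignored and usage are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Simplified sketch of the idea, not the class from the attached source.
class erc {
public:
  erc (int val) : value (val), active (val != 0) {}   // 0 is success: never armed
  ~erc () noexcept(false) {
    if (active) {
      active = false;   // disarm first, so the thrown copy does not throw again
      throw *this;
    }
  }
  operator int () { active = false; return value; }   // testing disarms the object
private:
  int value;
  bool active;
};

erc my_sqrt (double& value) {
  if (value < 0)
    return -1;
  value = std::sqrt (value);
  return 0;
}

// Hypothetical helper: the result is tested, so nothing is thrown.
bool failed_when_tested (double v) {
  return my_sqrt (v) == -1;
}

// Hypothetical helper: the result is dropped, so a failure surfaces as an
// exception. Returns the error value caught, or 0 if nothing was thrown.
int code_when_ignored (double v) {
  try {
    my_sqrt (v);
  } catch (erc& x) {
    return x;
  }
  return 0;
}

// Usage sketch: both ways of noticing the failure, and the silent success.
void usage () {
  assert (failed_when_tested (-1.0));
  assert (!failed_when_tested (4.0));
  assert (code_when_ignored (-1.0) == -1);
  assert (code_when_ignored (4.0) == 0);
}
```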
This is the essence of the whole idea of error code objects as presented in the Dr. Dobb's Journal article. However, I couldn't resist the temptation to take a simple idea and make it more complicated; keep reading!
More Details
The "big picture" presentation has ignored some details that are needed to make error codes more functional and to integrate them in large-scale projects. First, we need a move constructor and a move assignment operator that take over the activity flag of the source object and deactivate the source. This ensures that only one erc object is active at any time.
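The transfer of the activity flag can be sketched as follows. This is an illustration under stated assumptions, not the code from the attached source: the accessor is_active is hypothetical and exists only for the demonstration, and what the real move assignment does when its target is still active is not covered here (the sketch simply overwrites it).

```cpp
#include <cassert>
#include <utility>

// Sketch only: the flag handling is an assumption based on the description above.
class erc {
public:
  erc (int val = 0) : value (val), active (val != 0) {}

  // Move constructor: take over the activity flag and disarm the source.
  erc (erc&& other) noexcept : value (other.value), active (other.active) {
    other.active = false;
  }

  // Move assignment: same transfer; an active target is simply overwritten.
  erc& operator= (erc&& other) noexcept {
    if (this != &other) {
      value = other.value;
      active = other.active;
      other.active = false;
    }
    return *this;
  }

  ~erc () noexcept(false) {
    if (active) {
      active = false;            // disarm first: the thrown object stays quiet
      throw std::move (*this);
    }
  }

  operator int () { active = false; return value; }
  bool is_active () const { return active; }   // hypothetical accessor for the demo
private:
  int value;
  bool active;
};

// Usage sketch: after each move exactly one object is armed.
int usage () {
  erc a (-1);                 // armed
  erc b (std::move (a));      // b armed, a disarmed
  assert (!a.is_active () && b.is_active ());
  erc c;                      // success, never armed
  c = std::move (b);          // c armed, b disarmed
  assert (!b.is_active () && c.is_active ());
  return c;                   // testing c disarms it, so no destructor throws
}
```

Declaring the move operations also suppresses the implicit copy operations, which is one simple way to rule out two armed copies of the same error.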
We also need a mechanism for grouping classes of error codes together for easy handling. This mechanism is implemented through error facility objects (errfac). In addition to the value and the activity flag, each erc also has a facility and a severity level. The erc destructor does not throw an exception directly, as shown before; instead, it invokes the errfac::raise function of the associated facility object. The raise function compares the severity level of the erc object against a log level and a throw level associated with each facility. If the severity is higher than the facility's log level, errfac::raise() invokes errfac::log() to generate an error message; if it is higher than the throw level, errfac::raise() throws the exception. The names of the severity levels are borrowed from the UNIX syslog function (the numeric order is reversed, though: here, higher values are more severe):
Name | Value | Action |
ERROR_PRI_SUCCESS | 0 | never logged, never thrown |
ERROR_PRI_INFO | 1 | by default not logged, not thrown |
ERROR_PRI_NOTICE | 2 | by default not logged, not thrown |
ERROR_PRI_WARNING | 3 | by default logged, not thrown |
ERROR_PRI_ERROR | 4 | by default logged, thrown |
ERROR_PRI_CRITICAL | 5 | by default logged, thrown |
ERROR_PRI_ALERT | 6 | by default logged, thrown |
ERROR_PRI_EMERG | 7 | always logged, always thrown |
By default, error codes are associated with a default facility, but one can create different facilities to group classes of errors. For instance, you can create a specialized error facility for all socket errors that knows how to translate the numerical error codes into meaningful messages.
Having different error levels can be useful for test or debugging purposes when one can vary the throwing or logging level for a class of errors.
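The interplay between severity, log level and throw level can be sketched as follows. Only the names errfac, raise, log and the ERROR_PRI_ constants come from the text; everything else is an assumption of this sketch: the members log_level, throw_level and messages, the accessors code and level, the three-argument erc constructor, the message format, and the default levels (chosen so that "higher than" reproduces the defaults in the table). The attached source code may differ on all of these points.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum {
  ERROR_PRI_SUCCESS, ERROR_PRI_INFO, ERROR_PRI_NOTICE, ERROR_PRI_WARNING,
  ERROR_PRI_ERROR, ERROR_PRI_CRITICAL, ERROR_PRI_ALERT, ERROR_PRI_EMERG
};

class erc;

// Hypothetical facility: only the names errfac, raise and log come from the text.
class errfac {
public:
  explicit errfac (const std::string& name_) : name (name_) {}
  virtual ~errfac () {}
  void raise (const erc& e);
  virtual void log (const erc& e);

  int log_level = ERROR_PRI_NOTICE;     // assumed default: log above NOTICE
  int throw_level = ERROR_PRI_WARNING;  // assumed default: throw above WARNING
  std::vector<std::string> messages;    // stands in for a real log sink
private:
  std::string name;
};

class erc {
public:
  erc (int val, int sev, errfac& fac)
    : value (val), severity (sev), facility (&fac), active (val != 0) {}
  ~erc () noexcept(false) {
    if (active) {
      active = false;           // disarm first: a thrown copy stays quiet
      facility->raise (*this);
    }
  }
  operator int () { active = false; return value; }
  int code () const { return value; }
  int level () const { return severity; }
private:
  int value;
  int severity;
  errfac* facility;
  bool active;
};

void errfac::log (const erc& e) {
  messages.push_back (name + " error " + std::to_string (e.code ()));
}

void errfac::raise (const erc& e) {
  if (e.level () > log_level)
    log (e);
  if (e.level () > throw_level)
    throw e;
}

// Usage sketch: three untested errors of increasing severity.
void usage () {
  errfac sockets ("socket");
  int thrown = 0;
  try {
    erc (1, ERROR_PRI_NOTICE, sockets);    // neither logged nor thrown
    erc (2, ERROR_PRI_WARNING, sockets);   // logged only
    erc (3, ERROR_PRI_ERROR, sockets);     // logged and thrown
  } catch (erc& x) {
    thrown = x;
  }
  assert (thrown == 3);
  assert (sockets.messages.size () == 2);
}
```

Raising throw_level on one facility, for a test run say, silences a whole class of errors without touching the functions that return them.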
A More Realistic Example
The blog article mentioned before shows the basic layout of an HTTP client program:
Status get_data_from_server(HostName host)
{
  open_socket();
  if (failed)
    return failure();
  resolve_host();
  if (failed)
    return failure();
  connect();
  if (failed)
    return failure();
  send_data();
  if (failed)
    return failure();
  receive_data();
  if (failed)
    return failure();
  close_socket();
  return success();
}
The issue here is that an early return can produce a resource leak because the socket is not closed. Let's see how error codes could be used in this situation.
If we want to use exceptions, the code could look like this:
erc open_socket ();
erc resolve_host ();
erc connect ();
erc send_data ();
erc receive_data ();
erc close_socket ();
erc get_data_from_server (HostName host)
{
  erc result;
  try {
    // Return values are not tested: a failure throws
    open_socket ();
    resolve_host ();
    connect ();
    send_data ();
    receive_data ();
  } catch (erc& x) {
    result = x;
  }
  close_socket ();   // reached on every path, so the socket is not leaked
  return result;
}
Without exceptions, the same code can be written as:
erc open_socket ();
erc resolve_host ();
erc connect ();
erc send_data ();
erc receive_data ();
erc close_socket ();
erc get_data_from_server (HostName host)
{
  erc result;
  // The first failure (non-zero value) short-circuits the chain
  (result = open_socket ())
    || (result = resolve_host ())
    || (result = connect ())
    || (result = send_data ())
    || (result = receive_data ());
  close_socket ();   // reached on every path, so the socket is not leaked
  result.reactivate ();
  return result;
}
In the fragment above, result has been converted to an integer because it has to participate in the logical OR expression. This conversion resets the activity flag so we have to explicitly turn it on again by calling the reactivate() function. If all the functions have been successful, result is 0 and, by convention, it will not throw an exception.
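To see the short-circuit chain at work, here is a self-contained sketch with stand-in socket functions. The erc class is again a simplified version, not the one from the attached source. The helpers step, run and usage, and the variables fail_step and trace, are hypothetical: they only make one chosen step fail and record which steps ran. The HostName parameter is left out to keep the sketch short.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified erc with the pieces this example needs; a sketch, not the real class.
class erc {
public:
  erc (int val = 0) : value (val), active (true) {}
  erc (erc&& other) noexcept : value (other.value), active (other.active) {
    other.active = false;
  }
  erc& operator= (erc&& other) noexcept {
    if (this != &other) {
      value = other.value;
      active = other.active;
      other.active = false;
    }
    return *this;
  }
  ~erc () noexcept(false) {
    if (active && value != 0) {   // a value of 0 never throws
      active = false;             // disarm first: the thrown object stays quiet
      throw std::move (*this);
    }
  }
  operator int () { active = false; return value; }
  erc& reactivate () { active = true; return *this; }
private:
  int value;
  bool active;
};

// Hypothetical stand-ins for the socket functions: step number fail_step fails.
static int fail_step = 0;
static std::vector<std::string> trace;   // records which steps ran

static erc step (int n, const char* name) {
  trace.push_back (name);
  return erc (n == fail_step ? n : 0);
}
erc open_socket ()  { return step (1, "open"); }
erc resolve_host () { return step (2, "resolve"); }
erc connect ()      { return step (3, "connect"); }
erc send_data ()    { return step (4, "send"); }
erc receive_data () { return step (5, "receive"); }
erc close_socket () { return step (6, "close"); }

erc get_data_from_server () {
  erc result;
  (result = open_socket ())
    || (result = resolve_host ())
    || (result = connect ())
    || (result = send_data ())
    || (result = receive_data ());
  close_socket ();        // runs on every path
  result.reactivate ();   // the conversions above disarmed it
  return result;
}

// Returns the error value thrown by the ignored result, or 0 if none was thrown.
int run (int failing_step) {
  fail_step = failing_step;
  trace.clear ();
  try {
    get_data_from_server ();
  } catch (erc& x) {
    return x;
  }
  return 0;
}

// Usage sketch: a clean run visits all six steps; a failure skips to close.
void usage () {
  assert (run (0) == 0 && trace.size () == 6);
  assert (run (3) == 3 && trace.back () == "close");
}
```

When connect fails, send_data and receive_data are skipped, close_socket still runs, and the caller that ignores the result gets the exception.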
Final Touches
The attached source code is production quality and reasonably well optimized. Hopefully, that doesn't make it much harder to use. The demo project is a C++ wrapper for the popular SQLite database. It is much bigger only because it includes the latest version (as of this writing) of the SQLite code. Both the source code and the demo project include Doxygen documentation.
History
- 12th November, 2019: Initial version