Defensive programming

Some thoughts on how to write software defensively

Introduction

I wrote an article a few months ago about the use of ASSERTs[^] during program development in C++, more particularly focussed on MFC. Throwing such an article into the bearpit of CodeProject naturally generated a lot of comments about how ASSERT is only half of the question. Actually I think ASSERT is rather less than half the question and this article is my attempt to present some other techniques useful in catching programming errors before software reaches the hands of those who pay us for our efforts.

I'm assuming you've read my previous article about ASSERT and that I don't need to go into that topic again. If you haven't, there's the link up there.

Nevertheless, as a quick recap, ASSERT lets you, the programmer, state up front, in debug builds, that certain conditions must be met before program execution continues. If those conditions aren't met the program will abort and, on machines with the debugger installed, will break into the debugger on the ASSERT test that failed.

The problem with ASSERT is obvious. ASSERT is only present in debug builds of your software and its 'break into the debugger' behaviour only works on machines with a debugger installed. We can't (and shouldn't) assume that our users have a debugger installed. And of what use is breaking into the debugger on a machine that doesn't have access to the program database files? I've worked for a few organisations in my time and not one of them has allocated resources so that QA testing could be performed on debug builds. QA testing seems only to be performed on release builds so we need techniques for catching errors that work in release builds as well as debug builds.
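One technique that recurs below is writing checks that are compiled into release builds as well as debug builds, and that fail gracefully rather than breaking into a debugger. Here's a minimal sketch of the idea; the VERIFY_OR_RETURN macro is my own invention (not part of MFC) and it assumes the MFC ASSERT from the earlier article:

// A check that survives into release builds. Unlike ASSERT alone,
// the test is always compiled in; on failure the function returns
// an error instead of letting execution continue.
#define VERIFY_OR_RETURN(cond, err)                                   \
    do {                                                              \
        ASSERT(cond);               /* break in, on debug builds  */  \
        if (!(cond)) return (err);  /* degrade gracefully, always */  \
    } while (0)

BOOL CopyName(LPCTSTR szName, LPTSTR szDest, size_t cchDest)
{
    VERIFY_OR_RETURN(szName != NULL, FALSE);
    VERIFY_OR_RETURN(szDest != NULL && cchDest > 0, FALSE);

    _tcsncpy(szDest, szName, cchDest - 1);
    szDest[cchDest - 1] = _T('\0');
    return TRUE;
}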

Scope

Let's be clear here. I'm not going to be presenting code snippets that will magically make your programs correct. Nor will I try to discuss 'correctness' in the absolute sense. Correctness can be defined many ways, most of them outside the scope of my intent. I'm concerned with writing code in such a way as to avoid crashes and the grosser sins of 'incorrectness'. I should also explicitly state that what follows is my opinion based on a decade and a half or so of writing C and C++ code. Some will feel that I emphasise one technique more than I should, or under-rate another technique. But this is what works for me.

I also want to state that I can barely remember the last time I used a non-Microsoft compiler. That was when I used Microsoft C version 2 (which was, in fact, a repackaging of Lattice C). So obviously all my comments here relate to my experiences with Microsoft C/C++. I also have no intention of migrating to compiler X so please, let's not clutter up the comment section with evangelisation for compiler X!

Writing correct code takes time and effort

I can't emphasise this enough. If you want to write correct code you have to be prepared to take the time to plan ahead, think about what you're writing and think about how to avoid problems. You have to know where your code can fail and be prepared to handle those failures.

The most important thing you can do

to eliminate errors and crashes in your program is to not use pointers. There's a reason why languages such as C# and Visual Basic don't support pointers and that reason is that a pointer can be pointed at anything at all. It's important to remember that modern operating systems enforce the concept of memory ownership. Your process owns some memory pages, not the entire address space. If, by accident, a pointer happens to point at a memory address not owned by your process and you try to read or write that memory the processor will throw an exception and the operating system will probably terminate your process. This will earn you the undying hatred of your users.

Alas, my favourite language is C++, so I can't avoid pointers. What to do? There are two things you can do. The first is that whenever you get a pointer from somewhere, test it. Make sure it's not NULL. Make sure that if it's a pointer you're going to write through, it's writeable before attempting that write. Make sure that if it's a pointer you're going to read from, it's readable before attempting the read. And if your code is returning a pointer, make very sure that you return a pointer that matches the contract your code signed up to. If, for whatever reason, you can't return a valid pointer then return a NULL pointer and hope that your caller checks for NULL.
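What might those tests look like? Here's a sketch in MFC terms. GetWidgetName and the Widget structure are inventions for illustration, and bear in mind that AfxIsValidAddress can only check what the operating system lets it check:

// A hypothetical API that defends itself against bad pointers
// before reading from or writing through them.
struct Widget
{
    TCHAR szName[32];
};

BOOL GetWidgetName(const Widget* pWidget, LPTSTR szName, UINT cchName)
{
    // Reject NULL and unreadable source memory up front.
    if (pWidget == NULL ||
        !AfxIsValidAddress(pWidget, sizeof(Widget), FALSE))
        return FALSE;

    // Reject a NULL or unwritable destination buffer.
    if (szName == NULL || cchName == 0 ||
        !AfxIsValidAddress(szName, cchName * sizeof(TCHAR), TRUE))
        return FALSE;

    _tcsncpy(szName, pWidget->szName, cchName - 1);
    szName[cchName - 1] = _T('\0');
    return TRUE;
}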

Contracts

Which advice brings up the question of a contract. If you've done any COM programming you should be familiar with the concept of programming contracts. A COM interface is defined, coded and released, at which point the 'contract' is defined. If I implement a COM interface of type X my interface must implement certain functionality as defined in the X COM interface. If I request an interface of type Y from somewhere I can expect that it will implement certain functionality.

But why should a COM interface be treated as special? Any API defines a 'contract' and should honour that contract. If I call the CreateWindow() API I expect to get one of two things, either a valid window handle or an error code telling me the reason the window couldn't be created. As the caller of the API it's up to me to check that it succeeded; as the implementer of an API it's up to me to define the contract, stick to it and return accurate error codes when I can't.

This, of course, can be extended if I'm defining a family of APIs. Suppose I'm writing an entire subsystem that creates windows and manipulates them. I expect you to call CreateWindow() to create a window and I'll pass you back some handle to the object representing that window. If I can't create that window for whatever reason I'll pass you back an error code. Later, when you manipulate that window in some way, you pass me the handle expecting me to do the correct thing. The correct thing, in my philosophy, is to do some validation on the thing you passed me as a handle, to make sure it really is a handle. Did my API create it? Does it represent valid memory? Does the thing it represents pass consistency checks? If it contains pointers to other things, do those pointers point at valid things? If the thing you pass my API isn't a handle, or if it's a handle you don't have the right to manipulate, or if it's internally inconsistent, the correct thing for me to do is not perform the operation but return you an error code stating the reason.
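To make that concrete, here's a sketch of the sort of validation I mean. The WINDOWOBJ structure, its 'magic' signature and the functions around it are all inventions for illustration:

// A hypothetical subsystem handle carrying a 'magic' signature
// that's checked on every call.
#define WINDOWOBJ_MAGIC 0x57494E44      // 'WIND'

struct WINDOWOBJ
{
    DWORD dwMagic;                      // must be WINDOWOBJ_MAGIC
    int   nWidth;
    int   nHeight;
};

// Validate a caller-supplied handle before touching it.
static BOOL IsValidWindowObj(const WINDOWOBJ* pObj)
{
    if (pObj == NULL)
        return FALSE;
    if (pObj->dwMagic != WINDOWOBJ_MAGIC)       // did my API create it?
        return FALSE;
    if (pObj->nWidth < 0 || pObj->nHeight < 0)  // is it internally consistent?
        return FALSE;
    return TRUE;
}

BOOL ResizeWindowObj(WINDOWOBJ* pObj, int nWidth, int nHeight)
{
    // Honour the contract: on a bad handle or bad arguments,
    // fail with an error code rather than crash.
    if (!IsValidWindowObj(pObj) || nWidth < 0 || nHeight < 0)
        return FALSE;

    pObj->nWidth  = nWidth;
    pObj->nHeight = nHeight;
    return TRUE;
}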

I talked about creating windows just now where we know quite well that windows are created by our hosting operating system. It's an easy example. The point is that any time you write software containing more than one function you're defining your own API. It's an API that may never be published to the wider world, it may never be used outside the context of the program itself, but it's an API nonetheless and it needs to define its contract with the rest of your code and stick to that contract. Likewise, your code needs to be aware of the contract and do the appropriate tests.

Let the compiler do the work

C++ is a strongly typed language. You can't, in C++, define a function that takes a char pointer and pass it an int pointer unless you're prepared to do one of two things. You either turn error checking off during the compile or you cast the pointer to the correct type. If you're prepared to turn off error checking in the compiler then hit the back button on your browser now and go read some other article.

Still reading? Good! We'll come back to casts shortly.

I always override the default compiler settings and specify the maximum Warning Level and Convert Warnings to Errors in my projects. The compiler does an awful lot of consistency checking and it has a much more accurate memory of what it's seen than I do. It'll remember that I defined SomeFunc(int value) to take an int and notice if I make a mistake and call SomeFunc() with an unsigned short. In short (pun intended), it'll notice that I haven't adhered to the contract and throw up an error message.  Having seen the error message I know that I'm passing the wrong datatype to the function and it's up to me to examine the calling code and the called code and reconcile the difference.  
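For what it's worth, those two settings correspond to the /W4 and /WX compiler switches, and the warning level can be forced per source file as well:

// Command-line equivalent of 'maximum Warning Level' plus
// 'Convert Warnings to Errors' in the project settings:
//
//     cl /W4 /WX myprogram.cpp
//
// The warning level can also be raised for a single source file:
#pragma warning(push, 4)    // warning level 4 from here on
// ... code that must compile cleanly at level 4 ...
#pragma warning(pop)        // restore the previous level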

It can be a pain in the bum to satisfy the compiler. There you are in a frenzy of coding, in the zone as it were. You hit the compile button and out come a hundred warnings and no executable. But every one of those warnings is a potential bug! And every potential bug you can fix at that point in time is one less bug that will escape into the wild when you release your software. It's also far less expensive to fix that bug right now than to fix it once your software is released.

Casts

Sometimes you just can't avoid casts. But I'd argue that those times are when you're interoperating with code outside of your control. The best example is a Windows message handler. You get a window handle, a message specifier, a WPARAM and an LPARAM. Let's ignore the 16/32 bit issue and just consider the WPARAM and LPARAM values. They can represent just about any datatype you like so long as that datatype fits within the size defined for that parameter.

Thus, an LPARAM might be an int or it might be NULL or it might be a pointer to some data. Your message handler must 'know' what that type is, and the type always depends on which message you're handling. In these cases the use of a cast is mandatory if you're sticking to the earlier suggestion that you let the compiler do the work.
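WM_COPYDATA is a good example: its contract says the LPARAM carries a pointer to a COPYDATASTRUCT, so a cast is the only way to honour that contract. A sketch:

LRESULT CALLBACK WndProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    switch (uMsg)
    {
    case WM_COPYDATA:
        {
            // The cast is unavoidable; the message contract defines the type.
            const COPYDATASTRUCT* pData =
                reinterpret_cast<const COPYDATASTRUCT*>(lParam);
            if (pData == NULL)          // defend against a broken sender
                return FALSE;
            // ... use pData->dwData, pData->cbData, pData->lpData ...
        }
        return TRUE;

    default:
        return DefWindowProc(hWnd, uMsg, wParam, lParam);
    }
}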

But if you're calling code you wrote and you need to perform a cast I'd suggest you need to examine your code very carefully indeed. Either the calling code doesn't understand the programming contract in your API or the code being called doesn't implement it the way you thought it did. I've written thousands of casts but the only times I've ever had to write a cast in code calling other code I've written is when the call is indirect, such as casting the void parameter to a thread procedure back to my own data type.
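That thread procedure case looks something like this sketch (CWorker is a made-up class; the cast merely restores the type the contract promised):

class CWorker
{
public:
    void Run();
};

UINT __cdecl ThreadProc(LPVOID pParam)
{
    // The contract: pParam is always the CWorker* that was passed
    // to AfxBeginThread, so this cast restores the real type.
    CWorker* pWorker = static_cast<CWorker*>(pParam);
    if (pWorker == NULL)
        return 1;               // honour the contract even here

    pWorker->Run();
    return 0;
}

// Usage: AfxBeginThread(ThreadProc, pWorker);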

Using const

Using const won't magically make your software more reliable. What it will do is make you think more carefully about what you're doing. Ensuring const correctness takes time and, initially, much wailing and gnashing of teeth. To be honest, I've only recently started paying close attention to const correctness. But I've found that the effort it takes pays off in terms of reduced surprise. And it does get easier with practice. It's pretty easy to get into the habit of declaring parameters as const. We all do it right now, every time we declare a parameter as being of type LPCTSTR.

More importantly, it's possible to specify a member function as const. This means you can't change any member data in the function unless that member data is marked as mutable. The benefit is that you know that calling a member function marked const won't have any side effects. Well, that's almost true. Calling a member function marked const means that you can't change any non-mutable data members of that object instance. That doesn't mean it can't change global variables or call some operating system API that might have unexpected side effects.
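A quick sketch of both halves of that rule:

// A const member function can't modify members unless they're
// declared mutable.
class CAccount
{
public:
    double Balance() const
    {
        ++m_nQueries;           // OK: m_nQueries is mutable
        // m_dBalance = 0.0;    // error: this member function is const
        return m_dBalance;
    }

private:
    double       m_dBalance;
    mutable long m_nQueries;    // bookkeeping, not logical state
};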

Using enums

In another article[^] I discussed using enums as 'magic' values where appropriate. A few people misunderstood the article and thought it should have been a treatise on object oriented design. So I'll restate the point I was trying to make.

enums should be used whenever you have a series of discrete and possibly disjoint values. The enum itself defines the list of possible values and an 'umbrella' datatype that represents the set of values. For example,

enum verycontrivedexample
{
    Bob,
    Rob,
    Chris,
    Roger
};
This defines a datatype called verycontrivedexample and a set of values. Now suppose we write a function like this,
void MyFunction(verycontrivedexample data)
{
    //  do something with the data passed...

}
The only way we can call this function and expect the compiler to actually compile it is to pass one of the values defined within the verycontrivedexample enum. I can pass this function a Bob or a Rob but I can't pass it 42 and expect it to compile. Remember 'let the compiler do the work'? If you define your 'magic' values as enums and use the enum tag as the datatype the compiler will complain when you try to pass a value that isn't defined within the enum.
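To illustrate:

MyFunction(Bob);                        // compiles: Bob is a verycontrivedexample
MyFunction(42);                         // error: no conversion from int
MyFunction((verycontrivedexample)42);   // compiles, but the cast is a red flag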

As with const, it takes a little time to get used to doing this, but the payback is well worth it.

Buffer checks

Sasser, Blaster, Nimda, Code Red: the list of worms and viruses that exploit buffer overflows goes on. While it's unlikely our applications will ever be a target for such attacks in the same way that an operating system is, it's still important to be sure that we don't abuse our environment. For example, I might write this code

void MyFunc(LPCTSTR szString)
{
    TCHAR buffer[10];

    _tcscpy(buffer, szString);  // overruns buffer if szString holds more than 9 characters
}
        
which might be benign. When I designed my program I 'knew' that the string passed to MyFunc would never be longer than 9 characters. Who knows, perhaps that will be true for the lifetime of my program. But it's much safer to write

#define countof(x) (sizeof(x) / sizeof(x[0]))   // Defined in some global header

void MyFunc(LPCTSTR szString)
{
    TCHAR buffer[10];

    memset(buffer, 0, sizeof(buffer));                // zero the buffer first
    _tcsncpy(buffer, szString, countof(buffer) - 1);  // copy at most 9 characters
}
        

which zeroes the buffer and then copies a maximum of 9 characters (in this example) to the buffer. Note that both steps are necessary. The first step is to zero the buffer. I do this because my use of TCHAR implies that this is a string which will presumably be passed to other APIs that expect string semantics. By zeroing the buffer I ensure that string semantics are observed even in the case where the LPCTSTR parameter points to a string shorter than the buffer. The second step does the actual copy. It copies a maximum of countof(buffer) - 1 characters to the buffer. This ensures that I won't copy more into the buffer than it can hold. The -1 part is also important. That ensures that the final character in the buffer is a zero, ensuring string semantics. Note that I use the countof() macro which accounts for whether this is an ANSI/MBCS or UNICODE build. (Thanks to Michael Dunn for dunning this into my head).

Check function return values for errors

We're all guilty of this one. How many times have you called a function and failed to check the return value for an error? I have, thousands of times. But that's a mistake one makes less frequently as one's experience increases. In most cases, if a function can fail, it'll return an error indicator. Shouldn't your code check for that error indicator and attempt a graceful failure when it detects one? An easy example is an attempt to write to a file that's read-only. Presumably the attempt to write to the file is important to the functioning of your program, so if the write fails the code following the write shouldn't simply assume the write succeeded, particularly if it involves throwing away state. It's up to you to decide exactly what should be done with an error condition - but whatever you do should degrade gracefully. In the example of trying to write to a read-only file one approach might be to inform the user that the file is read-only and give them the option to change it to writeable. Another approach might be to warn the user that the write failed because the file is read-only and to give them the option to save to another filename. How you do it varies according to your audience - but it's important that the user be told that a problem arose, and it's important that your code knows about and copes with the error.
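As a sketch of the shape such code might take (SaveState, the messages and the use of the C runtime are all placeholders for whatever your application really does):

#include <stdio.h>
#include <tchar.h>

// Write a block of state to disk, degrading gracefully on failure.
BOOL SaveState(LPCTSTR szPath, const void* pData, size_t cbData)
{
    FILE* pFile = _tfopen(szPath, _T("wb"));
    if (pFile == NULL)
    {
        // Perhaps the file is read-only: tell the user, and let the
        // caller offer 'save as' rather than silently losing state.
        AfxMessageBox(_T("The file could not be opened for writing."));
        return FALSE;
    }

    size_t cbWritten = fwrite(pData, 1, cbData, pFile);
    fclose(pFile);

    if (cbWritten != cbData)                // check the write, too
    {
        AfxMessageBox(_T("The file could not be written completely."));
        return FALSE;
    }
    return TRUE;
}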

Document what you did, and why you did it!

Toward the start of this article I stated that the most important thing you could do to ensure correct behaviour of your software was to avoid the use of pointers. I lied. In reality the most important thing you can do to ensure correct behaviour is to document its behaviour. Remember the contract? A contract is only as good as the medium it's recorded in. If you have to reread the source code half a year from now to winkle out some obscure side effect you're going to miss that side effect and the quality of your software will suffer. The side effect should be documented, preferably in the source code itself.

Never assume your users won't do something!

They will!  Period.  I wish I had a dollar for every time I've seen the following scenario. (I'd have about 20 bucks).

User:  When I hit ctrl+alt+print screen+pause your program crashes.

Programmer: Well why the hell did you hit those keys?

My question? Why the hell not? It's NOT our users' fault they hit a combination of keys that crashed our software. Read that again. It's NOT our users' fault they hit a combination of keys that crashed our software. Any time you write code that assumes your users know as much about your software as you do you're setting out on the path to disaster. They don't. And, more importantly, they don't want to! If you've even bothered to read this far into the article you've shown that you care about code and, I hope, shown that you care about your end user experience.

Conclusion

I've presented a few traps and gotchas I've encountered in 20 or so years of professional software development. This is by no means an exhaustive list of the things that you can do to make your software bulletproof. No such list exists. But as I said in the scope, this is what works for me!

History

5 August 2004 - Initial version.
