What Does The Preprocessor Do?

4 Nov 2014
What does the preprocessor do?

What a jolly good question.

The preprocessor takes a look at your source code just before it goes off to the compiler, does a little formatting, and carries out any instructions you have given it.

Like what?

Well, preprocessor instructions are called preprocessor directives, and they all start with a #.

Like #include?

Exactly.

Each # command that the preprocessor encounters results in a modification to the source code in some way. Let’s take a look at them briefly in turn, and then we’ll see what goes on behind the scenes.

#include

Includes header files for other libraries, classes, interfaces, etc. The preprocessor actually copies the entire header into your source file* (yes, that’s why inclusion guards are such a good thing).

#define

Who doesn’t love macros! The preprocessor replaces every later occurrence of the macro name with the replacement text you defined. The definition holds from the #define to the end of the file being compiled, unless an #undef directive is found for that name.

#ifdef

Conditional behaviour that tells the preprocessor to keep the code inside the conditional block only IF the condition is met. You can use these just like if-else statements, choosing from: #ifdef, #ifndef, #if, #else, and #elif, and you always need to finish with an #endif.

#error #warning

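Used for sending messages to the user. Compilation fails on #error, but carries on after #warning. In both cases, the text after the directive is printed as part of the compiler’s diagnostic output (quotes are optional, but they stop apostrophes in the message from confusing the preprocessor), so they are handy ways to ensure everything is set up correctly for your platform. Note that #warning began life as a compiler extension and was only standardised in C23 and C++23, so check that your compiler supports it.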

#line

Used to alter the line number and filename displayed when you encounter compilation errors. This is useful if, for example, you are compiling an intermediate file (possibly auto-generated) and need the errors to refer back to the original source file.

#pragma

Other implementation-specific directives interpreted by the compiler. Your compiler documentation will tell you what pragmas are available, and you should never assume that they will be available on other compilers.

#assert #unassert

These were eternally popular in older programs (well, the ones I’ve worked on at least), but they are now considered obsolete. Their use is strongly discouraged, which means don’t put them in new code. ;-)

Predefined Macros

There are a number of predefined macros available for use:

  • __FILE__ Gives the filename as a string
  • __LINE__ Gives the current line number (as an integer)
  • __DATE__ The compile date as a string
  • __TIME__ The compile time as a string
  • __STDC__ Compiler dependent, but usually defined as 1 to indicate compliance with the ISO C standard.
  • __cplusplus Always defined when compiling a C++ program

The first two in particular are really useful in debugging. Just pop them in and magically you get informative output without having to write your own file and line processing class.

Your compiler may support other macros. For example, the full list for GCC can be found in the GCC preprocessor documentation.

So What Actually Happens When You Run the Preprocessor?

  1. Replace all trigraphs. I’ll actually talk about this in a future post, because although it’s effectively a historical feature (and you have to switch it on in GCC), it’s still quite interesting.
  2. Concatenate source code split over multiple lines.
  3. Remove each comment and replace with a space.
  4. Deal with preprocessor directives (those we talked about above). For #include, it recursively carries out steps 1-4 on the new file. :-)
  5. Process any escape sequences.
  6. Pass the file to the compiler.

If you want to see what your file looks like after preprocessing (and who doesn’t?), you can pass gcc the -E option. This will send the preprocessed source code to stdout and then stop execution without compiling or linking.

e.g.

g++ -E myfile.cpp

Or, you can use the compile flag:

-save-temps

This compiles as usual but keeps a copy of the temporary files.

For example, let’s take a simple program:

#include <stdio.h>

#define ONE 1
#define TWO 2

int main()
{
    printf("%d, %d\n", ONE, TWO);
    return 0;
}

And then compile it with:

g++ hello.cpp -save-temps

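When compilation is finished, you’ll have some additional files in your directory, including hello.s and hello.ii. (GCC also keeps the object file, and the exact file names can vary between GCC versions.)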

hello.s contains assembler instructions and hello.ii contains your source with the preprocessing completed.

If you look at hello.ii in a text editor, you’ll see that it has a LOT of code in it. That’s because you used an #include directive to pull in the stdio header.

Even better, if you scroll right to the bottom, you can also see that the preprocessor has replaced the ONE and TWO macros in the printf statement with the actual definitions, 1 and 2.

Awesome!

*Actually, it makes a temporary copy of your source file and expands out all the directives that it finds into that copy. The file is deleted after use, so ordinarily you would never know it existed.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)