Include Guards, Pragma Once, Predeclarations and other hints that might be useful when dealing with includes.
What can we do about file inclusion in C++? Do we need to include every other header of the project (plus third-party libraries) in every file, all the time? Surely there have to be some rules to manage this properly.
The issue covered in this blog post is of course nothing new. Every C++ programmer should know how to use #include correctly. But somehow, I still see lots of code that is a mess and takes far too long to compile... What is worse (as in most other cases), even if you try for some time to follow a good #include policy, after a while chaos still creeps back into the files. I am of course also responsible for such mistakes.
What is the Problem?
Why is it so important to minimize the number of header files and include statements?
Here is a generic picture:
Do you see the answer here? Of course, your program may have a far more complex structure, so add another 100 files and connect them randomly.
The C++ compiler's work regarding header files:
- Read all the header files (open each file, read its content, report errors if any occur)
- Paste the headers' content into one translation unit
- Parse and obtain the logical structure of the code inside each header
- Run old C macros; this might even change the final structure of a file
- Instantiate templates
- Do a lot of string processing in general
If there is too much redundancy, the compiler needs to work considerably longer.
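To make the macro step above concrete, here is a minimal single-file sketch. The header name geometry.h, the macro GEOMETRY_USE_DOUBLE and the types are hypothetical, made up for illustration. The header's text is written inline, which is roughly what the compiler sees after the preprocessor has pasted an #include.

```cpp
#include <type_traits>

// Normally set by the build system or by the including file (hypothetical name).
#define GEOMETRY_USE_DOUBLE 1

// --- begin: text of a hypothetical geometry.h, as pasted by the preprocessor ---
#if GEOMETRY_USE_DOUBLE
using Scalar = double;   // this branch is kept
#else
using Scalar = float;    // this branch is dropped before the compiler parses anything
#endif

struct Point {
    Scalar x;
    Scalar y;
};
// --- end of geometry.h ---

// The same header text yields a different Point depending on the macro.
static_assert(std::is_same<Scalar, double>::value,
              "with GEOMETRY_USE_DOUBLE == 1, Scalar is double");

inline Scalar lengthSquared(const Point& p) {
    return p.x * p.x + p.y * p.y;
}
```

Every translation unit that includes such a header repeats this work, and two translation units that define the macro differently end up with different definitions of Point.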
Any Guidelines?
- Forward declarations everywhere!
- Try to use them wherever you can. This will reduce the number of included files. Note that where a type is only mentioned (in a function declaration, or as a pointer or reference member), the include file is often not needed by the compiler: it needs to know only the type's name, not its full definition.
- Header order
- File myHeader.h (containing some classes) should be included first (or just after the common precompiled header) and should be self-contained. That means when we use myHeader.h somewhere else in the project, we do not have to know what its additional include dependencies are.
- Speed
- Modern compilers are pretty good at optimizing access to header files. But some additional help from our side can be good.
- Precompiled headers can be a life and time saver. Put as many system and third-party library headers in them as you can. Unfortunately, things can get nasty when you need a multiplatform solution and when you include too much. Read more here: gamesfromwithin
- #pragma once, include guards and redundant include guards: there is no clear winner when choosing the best combination. In Visual Studio, #pragma once seems to be superb, but it is not a standardized solution. For instance, GCC is usually better with standard include guards.
- Tools
- Find whatever tool you like and generate dependency graphs for a particular C++ source file.
- One quick tool that might be useful is Visual Studio's option /showIncludes, which (as the name suggests) prints all the includes that go into a C++ source file. If the list is too long, it may be worth taking a closer look at that particular file. In GCC, there is an even more advanced option, -M, which outputs the file's dependencies as a make rule.
As we can see, we can substantially reduce the number of includes by using pointers or references in member and argument declarations. In general, a file should include only the minimal set of headers needed to compile it. It is even possible to reduce this number to zero.
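A minimal sketch of the forward declaration idea, squeezed into one file so it can be compiled as is. The file names in the comments and the classes Widget and Engine are hypothetical. The point is that the Widget.h part never needs the full definition of Engine, only its name.

```cpp
#include <cassert>

// --- Widget.h (hypothetical) ---
class Engine;  // forward declaration instead of #include "Engine.h"

class Widget {
public:
    explicit Widget(Engine* engine) : engine_(engine) {}
    int power() const;   // only declared here; the body needs the full Engine
private:
    Engine* engine_;     // a pointer to an incomplete type is fine
};

// --- Engine.h (hypothetical) ---
class Engine {
public:
    explicit Engine(int horsepower) : horsepower_(horsepower) {
        assert(horsepower >= 0);
    }
    int horsepower() const { return horsepower_; }
private:
    int horsepower_;
};

// --- Widget.cpp: the one place that has to include Engine.h ---
int Widget::power() const { return engine_->horsepower(); }
```

Code that includes only Widget.h does not get recompiled when Engine.h changes. If Widget held an Engine by value instead of a pointer, the forward declaration would no longer be enough and the include would have to go back into the header.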
Ideally:
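// No leading underscore: names that begin with an underscore followed by
// an uppercase letter are reserved for the implementation.
#ifndef HEADER_A_INCLUDED_H
#define HEADER_A_INCLUDED_H
class A
{
};
#endif // HEADER_A_INCLUDED_H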
And in the source file:
#include <stdafx.h> // precompiled if needed
#include "A.h"
#include "..." // all others
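To see what the include guard actually buys us, here is a small single-file sketch in which the text of a hypothetical A.h appears twice, as would happen if two different headers both included it. The member function id() is added only for illustration, and the guard name is written without a leading underscore because names starting with an underscore and an uppercase letter are reserved for the implementation.

```cpp
// --- first inclusion of A.h ---
#ifndef HEADER_A_INCLUDED_H
#define HEADER_A_INCLUDED_H
class A {
public:
    int id() const { return 1; }
};
#endif // HEADER_A_INCLUDED_H

// --- second inclusion of A.h (e.g. indirectly, through another header) ---
#ifndef HEADER_A_INCLUDED_H   // false now: the macro is already defined
#define HEADER_A_INCLUDED_H
class A {                     // skipped by the preprocessor, so no redefinition error
public:
    int id() const { return 1; }
};
#endif // HEADER_A_INCLUDED_H
```

Without the guard, the second copy would be a redefinition of class A and the file would not compile. With #pragma once, the header would start with that single line instead of the #ifndef/#define pair; it is widely supported but not part of the C++ standard.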
Is There a Hope?
Header files can be very problematic, and they are definitely not a great feature of the C++ language. If you include too much, your compile time can grow and grow, and it is not so easy to control. But what are the other options? How do other languages handle a similar issue?
It is hard to compare compilation of Java and C# to C++: C++ produces native binary code that is optimized for the particular architecture. Managed languages compile to some form of intermediate language that is much simpler than native code. It is worth mentioning that managed languages use modules (not includes) that are almost the final version of the compiled code. That way, the compiler does not have to parse a module again and again. It just grabs the needed, already compiled data and metadata.
So it seems that the lack of modules is the main problem for C++. Modules would reduce translation unit creation time and minimize redundancy. I have already written about this some time ago: modules in C++ via Clang. On the other hand, C++ compilation is very complex, and thus it is not so easy to introduce and, more importantly, standardize the module concept.
Links
- Link to interesting (and more general) question: why-does-c-compilation-take-so-long
- Large Scale C++ Software Design by John Lakos - I mentioned it in my previous post about insulation. In the book, there are detailed discussions about the physical structure of C++ code. Suggested reading for all C++ programmers.
- even-more-experiments-with-includes @ Games From Within
- RedundantIncludeGuards - A simple technique where before including something, you simply check if its include guard is already defined. In older compilers, it could give a performance boost, but in modern solutions, the benefit of using it is not that visible.
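The redundant include guard technique from the last item can be sketched as follows. This is a single-file illustration with hypothetical names (A.h, B.h, classes A and B). The text of A.h is written inline first to simulate that it has already been included once.

```cpp
// --- text of a hypothetical A.h, already included earlier ---
#ifndef HEADER_A_INCLUDED_H
#define HEADER_A_INCLUDED_H
class A {
public:
    int value() const { return 42; }
};
#endif // HEADER_A_INCLUDED_H

// --- B.h (hypothetical): a redundant guard wrapped around the include ---
#ifndef HEADER_A_INCLUDED_H
#include "A.h"   // not processed here: the guard macro is defined, so the
                 // preprocessor skips this group and never opens the file
#endif

class B {
public:
    int value() const { return a_.value(); }
private:
    A a_;
};
```

The ordinary guard inside A.h still requires the file to be opened and scanned up to the closing #endif; the outer check avoids even that. The price is that every including file has to know the name of the guard macro, which is one reason the trick pays off less with compilers that already optimize guarded headers.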
To Be Continued...
In the near future, I will try to post some benchmarks here regarding compilation time and #include tricks.