Introduction
In C++, every name (of a template, type, function, or object) is subject to the one definition rule (ODR): it can be defined only once. This article covers some issues with the ODR and offers recommendations and solutions to the problems that may occur, particularly when constants and constant expressions are used in inline functions. The first five examples compile with GCC C++ 4.8.1, the Microsoft Visual C++ Compiler Nov 2013 CTP, GCC C++ 4.9 and Clang 3.4. The last example compiles only with GCC C++ 4.9 and Clang 3.4.
The One Definition Rule
In the coming C++14 standard (as well as in the present C++11 standard) the one definition rule states (see [1], section 3.2):
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template.
Then the standard states that:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements.
The requirements are that each definition must be the same: it must consist of the same tokens and refer to the same entities. Normally we put such definitions, together with declarations, into header files. Other definitions (for example, non-inline functions) should usually go into source files (cpp-files). The problem is that some of the names defined at the top level have external linkage, which means that the same definition is shared across the whole program; others have internal linkage, which means that they may have a different definition in each translation unit. The exact definition in the standard is as follows:
A name is said to have linkage when it might denote the same object, reference, function, type, template, namespace or value as a name introduced by a declaration in another scope:
- When a name has external linkage, the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit.
- When a name has internal linkage, the entity it denotes can be referred to by names from other scopes in the same translation unit.
- When a name has no linkage, the entity it denotes cannot be referred to by names from other scopes.
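The three kinds of linkage can be illustrated with a minimal sketch of a single translation unit. All names here (shared_counter, file_counter, hidden_counter, kStep and the two functions) are hypothetical and chosen only for illustration:

```cpp
#include <cassert>

// External linkage: other translation units can refer to these names
// once they declare them (for example, "extern int shared_counter;").
int shared_counter = 0;
int next_shared() { return ++shared_counter; }

// Internal linkage: these names are visible only in this translation unit.
static int file_counter = 0;           // explicitly static
namespace { int hidden_counter = 0; }  // member of an unnamed namespace
const int kStep = 10;                  // namespace-scope const is internal by default

int next_local()
{
    // No linkage: a block-scope variable cannot be named from any other scope.
    int step = kStep;
    file_counter += step;
    hidden_counter += step;
    assert(file_counter == hidden_counter);
    return file_counter;
}
```

Another source file could define its own file_counter, hidden_counter or kStep without any conflict, but a second definition of shared_counter or next_shared() would violate the ODR.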
Unfortunately, when ODR is violated, no diagnostic is required: the program will compile and run with some surprising effects. We will be looking at some of those effects below.
Inline Functions
The reason we define inline functions is to inform the compiler that each call to such a function can be replaced by its body, with the formal parameters replaced by the arguments (actual parameters). This usually makes the code faster because the jump to the function body and the return from it are eliminated, but the program itself can become larger.
Defining a function as inline does not necessarily mean that the compiler will "agree" to inline it: it may still be compiled as an ordinary function. If the function is not inlined and is used in several modules, it is better to have only one copy of its code (in one module) rather than a copy in each module where it is used. In this case the linker makes sure that all the modules reference that copy correctly.
By default, an inline function has external linkage, although it is possible to declare it explicitly static or to put it inside an unnamed namespace.
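The three options can be sketched side by side. The function names below (twice, local_offset, local_scale, combine) are hypothetical; the sketch only shows where each keyword goes and is not taken from the examples in this article:

```cpp
#include <cassert>

// External linkage (the default): every translation unit that contains this
// definition must see exactly the same tokens.
inline int twice(int x) { return 2 * x; }

// Internal linkage via static: each translation unit gets its own function,
// so a different body in another cpp-file does not clash with this one.
static inline int local_offset(int x) { return x + 1; }

// Internal linkage via an unnamed namespace: the same effect as static.
namespace
{
    inline int local_scale(int x) { return 3 * x; }
}

int combine(int x)
{
    return twice(x) + local_offset(x) + local_scale(x);
}

// Small self-check: 2*2 + (2+1) + 3*2 == 13.
void check_combine()
{
    assert(combine(2) == 13);
}
```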
Violation of ODR For Classes and Inline Functions
Let's consider the following sample program that consists of two modules:
// odr_test1.cpp (main module)
#include <iostream>

void module1_print();

inline int f1()
{
    return 4;
}

class A
{
public:
    static double f()
    {
        return 4.1;
    }
};

const double C = 4.2;
constexpr double E = 4.5;

void print()
{
    std::cout << "main f1(): " << f1() << std::endl;
    std::cout << "main A::f(): " << A::f() << std::endl;
    std::cout << "main C: " << C << std::endl;
    std::cout << "main E: " << E << std::endl;
}

int main()
{
    module1_print();
    print();
    int i;
    std::cin >> i;
}

// module.cpp
#include <iostream>

inline int f1()
{
    return 3;
}

class A
{
public:
    static double f()
    {
        return 3.1;
    }
};

const double C = 3.2;
constexpr double E = 3.5;

void module1_print()
{
    std::cout << "module1 f1(): " << f1() << std::endl;
    std::cout << "module1 A::f(): " << A::f() << std::endl;
    std::cout << "module1 C: " << C << std::endl;
    std::cout << "module1 E: " << E << std::endl;
}
If we compile it with the Visual C++ Compiler Nov 2013 CTP, we will get the following output:
module1 f1(): 4
module1 A::f(): 4.1
module1 C: 3.2
module1 E: 3.5
main f1(): 4
main A::f(): 4.1
main C: 4.2
main E: 4.5
You may notice that, in both module1 and the main module, the functions f1() and A::f() produce the same results, 4 and 4.1 respectively, although in module1 they are "expected" to produce 3 and 3.1. The problem is that this program violates the ODR: the function f1() and the class A must have the same definitions in both translation units (modules).
If we use the g++ 4.9 or clang 3.4 compilers:
g++ -std=c++1y odr_test1.cpp module.cpp -o run1
clang++ -std=c++1y odr_test1.cpp module.cpp -o run1
we will get the same results as in Visual C++. But if we compile the modules in a different order:
g++ -std=c++1y module.cpp odr_test1.cpp -o run2
clang++ -std=c++1y module.cpp odr_test1.cpp -o run2
the results will be different:
module1 f1(): 3
module1 A::f(): 3.1
module1 C: 3.2
module1 E: 3.5
main f1(): 3
main A::f(): 3.1
main C: 4.2
main E: 4.5
Preference is given to the module that comes first on the command line. But, as I mentioned, no diagnostic is issued: the program compiles without complaint.
Let's look at the const and constexpr definitions: they both produce the expected results. Why? Because, in C++, constants and constant expressions defined at namespace scope have internal linkage unless they are declared extern. Their definitions are not shared across modules. You may define constants and constant expressions with the same names in different modules with different values and they will be treated correctly.
What shall we do with f1() and A::f() to correct our program? The solution is simple: we must put them into an unnamed namespace, which is often called an anonymous namespace. Here is a revised version of our program (I have deliberately renamed the files and the external function):
// main module
#include <iostream>

void module2_print();

namespace
{
    inline int f1()
    {
        return 4;
    }

    class A
    {
    public:
        static double f()
        {
            return 4.1;
        }
    };
}

const double C = 4.2;
constexpr double E = 4.5;

void print()
{
    std::cout << "main f1(): " << f1() << std::endl;
    std::cout << "main A::f(): " << A::f() << std::endl;
    std::cout << "main C: " << C << std::endl;
    std::cout << "main E: " << E << std::endl;
}

int main()
{
    module2_print();
    print();
    int i;
    std::cin >> i;
}

// module2
#include <iostream>

namespace
{
    inline int f1()
    {
        return 3;
    }

    class A
    {
    public:
        static double f()
        {
            return 3.1;
        }
    };
}

const double C = 3.2;
constexpr double E = 3.5;

void module2_print()
{
    std::cout << "module2 f1(): " << f1() << std::endl;
    std::cout << "module2 A::f(): " << A::f() << std::endl;
    std::cout << "module2 C: " << C << std::endl;
    std::cout << "module2 E: " << E << std::endl;
}
If you run this program you'll get the expected results:
module2 f1(): 3
module2 A::f(): 3.1
module2 C: 3.2
module2 E: 3.5
main f1(): 4
main A::f(): 4.1
main C: 4.2
main E: 4.5
In practice, if you define classes or inline functions in cpp-files, make sure that you put them inside an unnamed namespace (or some other namespace that is not shared across several modules). Alternatively, for inline functions, it is possible to use the keyword static.
Effects of Using Constants and Constant Expressions Inside Inline Functions
As mentioned in the previous section, constants and constant expressions (in contrast to variables) have internal linkage. Since they normally don't change, we often do not care which definition an inline function uses (the function itself has external linkage by default). But there are cases when this has surprising effects. Let's look at the following program:
// header3.h
#ifndef HEADER3_H
#define HEADER3_H

const double C = 3.2;
constexpr double E = 3.5;

inline const double& RefC()
{
    return C;
}

inline const double& RefE()
{
    return E;
}

void module3_print();

#endif // HEADER3_H

// main module
#include <iostream>
#include "header3.h"

void print()
{
    std::cout << "main C: " << C << std::endl;
    std::cout << "main E: " << E << std::endl;
    std::cout << "main address(C): "
              << std::hex << (unsigned long long)(&C) << std::endl;
    std::cout << "main address(E): "
              << std::hex << (unsigned long long)(&E) << std::endl;
    std::cout << "main address(RefC()): "
              << std::hex << (unsigned long long)(&RefC()) << std::endl;
    std::cout << "main address(RefE()): "
              << std::hex << (unsigned long long)(&RefE()) << std::endl;
}

int main()
{
    module3_print();
    print();
    int i;
    std::cin >> i;
}

// module3
#include <iostream>
#include "header3.h"

void module3_print()
{
    std::cout << "module3 C: " << C << std::endl;
    std::cout << "module3 E: " << E << std::endl;
    std::cout << "module3 address(C): "
              << std::hex << (unsigned long long)(&C) << std::endl;
    std::cout << "module3 address(E): "
              << std::hex << (unsigned long long)(&E) << std::endl;
    std::cout << "module3 address(RefC()): "
              << std::hex << (unsigned long long)(&RefC()) << std::endl;
    std::cout << "module3 address(RefE()): "
              << std::hex << (unsigned long long)(&RefE()) << std::endl;
}
This program will print something like this (the actual addresses may be different):
module3 C: 3.2
module3 E: 3.5
module3 address(C): b2de48
module3 address(E): b2de50
module3 address(RefC()): b2db38
module3 address(RefE()): b2db40
main C: 3.2
main E: 3.5
main address(C): b2db38
main address(E): b2db40
main address(RefC()): b2db38
main address(RefE()): b2db40
The main issue is that in module3 there is a difference between the addresses of C and E obtained directly and through the inline functions. Here I used the Visual C++ Compiler Nov 2013 CTP, but GCC 4.9 and Clang 3.4 show similar effects.
The problem is that the inline functions (called in &RefC() or &RefE()) refer to only one copy of the constant or constant expression across the whole program, whereas direct access to the address (through &C or &E) uses the object defined in the corresponding module.
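The distinction that matters here is whether the inline function uses only the value of the constant or also its address. The following single-file sketch contrasts the two cases. ValueC(), ValueE() and RefCIsLocalCopy() are hypothetical helpers that are not part of the example program, and the comments describe my reading of section 3.2 of [1] rather than a definitive statement:

```cpp
#include <cassert>

const double C = 3.2;      // internal linkage: one copy per translation unit
constexpr double E = 3.5;  // likewise

// Only the values are used, so it does not matter which copy is read.
inline double ValueC() { return C; }
inline double ValueE() { return E; }

// The address is used: when the header is included in several translation
// units, the merged function can return a reference to another module's copy.
inline const double& RefC() { return C; }

// Hypothetical per-module check. It is static, so every translation unit
// keeps its own version that compares against its own copy of C.
static bool RefCIsLocalCopy() { return &RefC() == &C; }

void check_values()
{
    assert(ValueC() == 3.2);
    assert(ValueE() == 3.5);
}
```

In a program with a single translation unit RefCIsLocalCopy() returns true. In the two-module program above it would be expected to return false in one of the modules, which is exactly the address mismatch shown in the output.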
This does not seem like a big problem: we rarely access the addresses of constants. But let's look at the next example, which uses mutable members:
// header4.h
#ifndef HEADER4_H
#define HEADER4_H

struct ClassC
{
    mutable double z = 10.1;
    ClassC() {}
};

const ClassC C;

struct ClassE
{
    mutable double z;
    constexpr ClassE(double z1) : z(z1) {}
};

constexpr ClassE E(7.5);

inline const ClassC& RefC()
{
    return C;
}

inline const ClassE& RefE()
{
    return E;
}

void module4_print();

#endif // HEADER4_H

// main module
#include <iostream>
#include "header4.h"

void print()
{
    std::cout << "main C.z: " << C.z << std::endl;
    std::cout << "main E.z: " << E.z << std::endl;
    C.z = 4.1;
    std::cout << "main RefC().z: " << RefC().z << std::endl;
    std::cout << "main C.z: " << C.z << std::endl;
    E.z = 4.9;
    std::cout << "main RefE().z: " << RefE().z << std::endl;
    std::cout << "main E.z: " << E.z << std::endl;
}

int main()
{
    module4_print();
    print();
    int i;
    std::cin >> i;
}

// module4
#include <iostream>
#include "header4.h"

void module4_print()
{
    std::cout << "module4 C.z: " << C.z << std::endl;
    std::cout << "module4 E.z: " << E.z << std::endl;
    C.z = 3.1;
    std::cout << "module4 RefC().z: " << RefC().z << std::endl;
    std::cout << "module4 C.z: " << C.z << std::endl;
    E.z = 3.9;
    std::cout << "module4 RefE().z: " << RefE().z << std::endl;
    std::cout << "module4 E.z: " << E.z << std::endl;
}
This program will print:
module4 C.z: 10.1
module4 E.z: 7.5
module4 RefC().z: 10.1
module4 C.z: 3.1
module4 RefE().z: 7.5
module4 E.z: 3.9
main C.z: 10.1
main E.z: 7.5
main RefC().z: 4.1
main C.z: 4.1
main RefE().z: 4.9
main E.z: 4.9
The value of RefC().z in module4 is 10.1 despite our effort to change the field C.z inside module4:
C.z = 3.1;
The value of RefE().z in module4 is 7.5 despite our effort to change the field E.z inside module4:
E.z = 3.9;
The issue is that the inline function definitions are merged together, which means that each inline function returns a reference to only one object (in this example, the objects C and E defined in the main module).
Since the inline function has external linkage, it would be more appropriate to give the constant and the constant expression external linkage as well.
The solution is to provide a single definition of C and of E in one module and then refer to them from the other module. To define C in one module, we can use either of the following two lines:
(1) const ClassC C;
(2) extern const ClassC C{};
They both provide a correct definition for the constant C using the default constructor. Line (1) has external linkage only if the extern declaration from the header is visible before it. On the other hand, the following lines will not work, because line (3) declares a function and line (4) is only a declaration, not a definition:
(3) const ClassC C();
(4) extern const ClassC C;
In the header file, we must put the following declaration:
extern const ClassC C;
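The difference between a declaration and a definition of C can be checked in a single file. This is a minimal sketch that places the header declaration and the module definition next to each other. NotAnObject and check_single_definition() are hypothetical names introduced only for this illustration:

```cpp
#include <cassert>
#include <type_traits>

struct ClassC
{
    mutable double z = 10.1;
    ClassC() {}
};

// What the header would contain: a declaration only, no object is created.
extern const ClassC C;

// What exactly one module would contain: the definition. Because the extern
// declaration above is already visible, C has external linkage.
const ClassC C;

// The form used in line (3): this declares a function that takes no
// arguments and returns const ClassC. It does not define an object.
const ClassC NotAnObject();
static_assert(std::is_function<decltype(NotAnObject)>::value,
              "NotAnObject is a function declaration");

inline const ClassC& RefC() { return C; }

void check_single_definition()
{
    C.z = 3.1;                // allowed because z is mutable
    assert(RefC().z == 3.1);  // the inline function sees the same object
    assert(&RefC() == &C);
}
```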
For the constant expression E, there are two methods. The first approach is to define a constexpr object in one module and then define a constant reference to it. In the module (we'll call it module5), we can write:
constexpr ClassE E_internal(7.5);
const ClassE& E = E_internal;
In the header, we put the following line instead of the constexpr definition:
extern const ClassE& E;
This will allow all the modules to access the same reference to the constexpr object. The program will output:
module5 C.z: 10.1
module5 E.z: 7.5
module5 RefC().z: 3.1
module5 C.z: 3.1
module5 RefE().z: 3.9
module5 E.z: 3.9
main C.z: 3.1
main E.z: 3.9
main RefC().z: 4.1
main C.z: 4.1
main RefE().z: 4.9
main E.z: 4.9
The main difference is in the following four lines of the output:
module5 RefC().z: 3.1
module5 C.z: 3.1
module5 RefE().z: 3.9
module5 E.z: 3.9
which means that the inline functions give the same results as the direct access to members.
Another approach is to define the constant expression with external linkage and access it directly from the other module. At present this is supported only by GCC 4.9 and Clang 3.4. We use the following definition in one of the modules:
extern constexpr ClassE E(7.5);
The issue is how to access it from other modules. The standard does not explicitly define this, but it mentions that a constexpr object is a const object. In the header we can write:
extern const ClassE E;
Here is the full version of the odr_test6 program:
// header6.h
#ifndef HEADER6_H
#define HEADER6_H

struct ClassC
{
    mutable double z = 10.1;
    ClassC() {}
};

extern const ClassC C;

struct ClassE
{
    mutable double z;
    constexpr ClassE(double z1) : z(z1) {}
};

extern const ClassE E;

inline const ClassC& RefC()
{
    return C;
}

inline const ClassE& RefE()
{
    return E;
}

void module6_print();

#endif // HEADER6_H

// main module
#include <iostream>
#include "header6.h"

void print()
{
    std::cout << "main C.z: " << C.z << std::endl;
    std::cout << "main E.z: " << E.z << std::endl;
    C.z = 4.1;
    std::cout << "main RefC().z: " << RefC().z << std::endl;
    std::cout << "main C.z: " << C.z << std::endl;
    E.z = 4.9;
    std::cout << "main RefE().z: " << RefE().z << std::endl;
    std::cout << "main E.z: " << E.z << std::endl;
}

int main()
{
    module6_print();
    print();
    int i;
    std::cin >> i;
}

// module6
#include <iostream>
#include "header6.h"

const ClassC C;
extern constexpr ClassE E(7.5);

void module6_print()
{
    std::cout << "module6 C.z: " << C.z << std::endl;
    std::cout << "module6 E.z: " << E.z << std::endl;
    C.z = 3.1;
    std::cout << "module6 RefC().z: " << RefC().z << std::endl;
    std::cout << "module6 C.z: " << C.z << std::endl;
    E.z = 3.9;
    std::cout << "module6 RefE().z: " << RefE().z << std::endl;
    std::cout << "module6 E.z: " << E.z << std::endl;
}
And this program will print:
module6 C.z: 10.1
module6 E.z: 7.5
module6 RefC().z: 3.1
module6 C.z: 3.1
module6 RefE().z: 3.9
module6 E.z: 3.9
main C.z: 3.1
main E.z: 3.9
main RefC().z: 4.1
main C.z: 4.1
main RefE().z: 4.9
main E.z: 4.9
References
[1] Working Draft, Standard for Programming Language C++,
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf