In this chapter we will deal with classes, structures and objects.
class and struct are the keywords we use to create a class in C++. A class can hold static and non-static member variables, and it can contain static, non-static and virtual member functions. For a detailed look at all the possible representations of class members, please refer to Inside the C++ Object Model. We will not go through them all; we will only discuss the current trend (the one followed by Clang).
Let's take an example:
class Point {
private:
    float _x;
    float _y;
public:
    Point() { _x = 0; _y = 0; }
    Point(const float x, const float y) : _x(x), _y(y) {}
    float x() const { return _x; }
    float y() const { return _y; }
    float x() { return _x; }
    float y() { return _y; }
};
int main() {
    Point p;
    return 1;
}
The generated assembly is shown below. Depending on your Clang settings you may get more or less code in your .ll file. I have removed all the code that does not pertain to, or add value to, the discussion at hand. For example, we will ignore "#0". This is an attribute group reference: a group of attributes is referenced with "#<Number>".
%class.Point = type { float, float }
define i32 @main() {
entry:
  %p = alloca %class.Point, align 4
  call x86_thiscallcc void @_ZN5PointC2Ev(%class.Point* %p)
  ret i32 1
}
; Function Attrs: nounwind
define linkonce_odr x86_thiscallcc void @_ZN5PointC2Ev(%class.Point* %this) unnamed_addr #1 align 2 {
entry:
  %this.addr = alloca %class.Point*, align 4
  store %class.Point* %this, %class.Point** %this.addr, align 4
  %this1 = load %class.Point** %this.addr
  %_x = getelementptr inbounds %class.Point* %this1, i32 0, i32 0
  store float 0.000000e+00, float* %_x, align 4
  %_y = getelementptr inbounds %class.Point* %this1, i32 0, i32 1
  store float 0.000000e+00, float* %_y, align 4
  ret void
}
Let us dissect the above assembly step by step.
Line 1 is how LLVM assembly represents a collection of data. The structure type is used to represent a collection of data members together in memory. The elements of a structure may be of any type that has a size.
Example:
%T1 = type { <type list> }
Similarly, in our example the class Point is represented as %class.Point = type { float, float }
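As a rough cross-check on the C++ side, the sketch below mirrors the two float members in a struct with public members and checks where each member sits. PointLayout is a hypothetical name, not part of the example above, and the checks assume a typical ABI in which the two floats are laid out back to back without padding; the C++ standard itself does not promise that.

```cpp
#include <cstddef>  // offsetof

// Hypothetical mirror of Point's data members, made public so that
// offsetof can be applied to them.
struct PointLayout {
    float x;  // corresponds to element 0 of { float, float }
    float y;  // corresponds to element 1 of { float, float }
};

// Assumption: no padding between or after the two floats (true on common ABIs).
static_assert(sizeof(PointLayout) == 2 * sizeof(float), "two floats, no padding");
static_assert(offsetof(PointLayout, x) == 0, "x is the first element");
static_assert(offsetof(PointLayout, y) == sizeof(float), "y directly follows x");
```

If the assumptions hold on your platform, the file compiles silently; a different layout would show up as a compile-time error.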
Now let's move on to the "main" function. As you can guess, the "align" specifier is used to specify alignment. Below is the code for main with comments. In LLVM, comments start with a ";".
define i32 @main() #0 {
entry:
  %p = alloca %class.Point, align 4 ; Allocate memory for our class Point instance
  call x86_thiscallcc void @_ZN5PointC2Ev(%class.Point* %p) ; Constructor call. The name is mangled.
  ret i32 1
}
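To read a mangled name such as _ZN5PointC2Ev, one option is to ask the C++ runtime to demangle it. The sketch below assumes a GCC or Clang toolchain that follows the Itanium C++ ABI and ships <cxxabi.h>; demangle is a hypothetical helper name introduced only for this illustration.

```cpp
#include <cstdlib>   // std::free
#include <string>
#include <cxxabi.h>  // abi::__cxa_demangle (Itanium C++ ABI toolchains only)

// Returns the demangled form of an Itanium-mangled name, or the input
// unchanged if demangling fails.
inline std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the buffer returned by __cxa_demangle is malloc-allocated
    return result;
}
```

Calling demangle("_ZN5PointC2Ev") should give Point::Point(). Reading the mangled name piece by piece: _Z starts a mangled name, N ... E encloses a nested name, 5Point is the length-prefixed class name, C2 marks the base object constructor, and v stands for an empty parameter list.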
In this code we see that the constructor is a separate function. So how does it know about the object's members?
This is the job of the this pointer, which is passed to the constructor as shown below:
@_ZN5PointC2Ev(%class.Point* %this) ; The this pointer passed to the constructor
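To see the same idea from the C++ side, the sketch below records the value of this inside a constructor and compares it with the address of the object being constructed. Tracked and this_matches_object_address are hypothetical names used only for illustration, and the constexpr function body assumes C++14 or later.

```cpp
// Hypothetical type that remembers the address its constructor received.
struct Tracked {
    const Tracked* self;                 // value of `this` seen by the constructor
    constexpr Tracked() : self(this) {}
};

// True when the constructor's `this` is exactly the address of the object,
// which is what passing %p as the %this argument expresses in the IR.
constexpr bool this_matches_object_address() {
    Tracked t;
    return t.self == &t;
}

static_assert(this_matches_object_address(), "this points at the object itself");
```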
Now why is the constructor call a must? Because it is the constructor that initializes the members. Let's look inside the constructor:
- We allocate memory to store a pointer with
%this.addr = alloca %class.Point*, align 4
- The "store" instruction has the format "store type value, type* destination address". So
store %class.Point* %this, %class.Point** %this.addr, align 4
saves the incoming this pointer in the slot we just allocated.
- The argument to the "load" instruction specifies the memory address from which to load. So
%this1 = load %class.Point** %this.addr
loads the content pointed to by %this.addr into %this1. This is basically the this pointer.
- The "getelementptr" instruction is used to get the address of a subelement of an aggregate data structure. It performs address calculation only and does not access memory. So
%_x = getelementptr inbounds %class.Point* %this1, i32 0, i32 0
fetches the address of "_x".
- Now
store float 0.000000e+00, float* %_x, align 4
stores the float value 0 in this element. The same two steps are then repeated for "_y" using index 1.
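The steps above can be sketched in C++ as a free function that takes the object address explicitly. This is only an illustration of what the generated constructor does, not code that Clang produces; PointData, point_construct and constructed_point are hypothetical names, and the constexpr functions assume C++14 or later.

```cpp
// Hypothetical plain-data mirror of Point's members.
struct PointData {
    float x;
    float y;
};

// Free-function form of Point::Point(): `self` plays the role of %this.
constexpr void point_construct(PointData* self) {
    float* px = &self->x;  // like getelementptr ... i32 0, i32 0: address only
    *px = 0.0f;            // like store float 0.0, float* %_x
    float* py = &self->y;  // like getelementptr ... i32 0, i32 1
    *py = 0.0f;            // like store float 0.0, float* %_y
}

// Starts from non-zero values so that the effect of the call is visible.
constexpr PointData constructed_point() {
    PointData p{1.0f, 2.0f};
    point_construct(&p);
    return p;
}

static_assert(constructed_point().x == 0.0f, "x is zeroed through the pointer");
static_assert(constructed_point().y == 0.0f, "y is zeroed through the pointer");
```

Taking the address of a member (&self->x) does not read the member, which matches the remark that getelementptr only computes an address.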
Empty Class
An empty class or structure is one without any data members or member functions. So what happens when we have this kind of construct? Is there any memory allocated for it? If yes, then why? Let's explore this concept.
Consider generating the AST for the two constructs below:
struct emptyS{};
class emptyC{};
TranslationUnitDecl 0xbe0bc0 <<invalid sloc>> <invalid sloc>
|-TypedefDecl implicit __builtin_va_list 'char *'
|-CXXRecordDecl <test.cpp:1:1, col:15> col:8 struct emptyS definition
| `-CXXRecordDecl <col:1, col:8> col:8 implicit struct emptyS
`-CXXRecordDecl <line:2:1, col:14> col:7 class emptyC definition
  `-CXXRecordDecl <col:1, col:7> col:7 implicit class emptyC
From the above dump we observe that test.cpp is the translation unit.
The definition of the class appears as "CXXRecordDecl <line:2:1, col:14> col:7 class emptyC definition".
The second, nested CXXRecordDecl is the implicit class name (the injected-class-name) inserted into the scope of the class itself, as described by the C++ standard.
Now let's generate the IR for this.
struct emptyS {};
class emptyC {};
int main() {
    emptyS s;
    emptyC c;
    return 1;
}
%struct.emptyS = type { i8 }
%class.emptyC = type { i8 }
define i32 @main() {
entry:
  %retval = alloca i32, align 4
  %s = alloca %struct.emptyS, align 1
  %c = alloca %class.emptyC, align 1
  store i32 0, i32* %retval
  ret i32 1
}
Although the class and the struct have no data members, their size is still nonzero.
Reason: The size is nonzero to ensure that two different objects have different addresses. If the size were zero, allocating memory for them would amount to
%s = alloca 0 and %c = alloca 0, and both could point to the same memory location.
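The same rule can be checked directly in C++. The sketch below reuses the emptyS and emptyC definitions and adds a hypothetical helper, array_elements_are_distinct, which assumes C++14 or later for its constexpr body. The exact size is implementation-defined, so the checks only require it to be at least one byte.

```cpp
struct emptyS {};
class emptyC {};

// An object of an empty class type still occupies at least one byte.
static_assert(sizeof(emptyS) >= 1, "empty struct has nonzero size");
static_assert(sizeof(emptyC) >= 1, "empty class has nonzero size");

// Neighbouring array elements are different objects, so their addresses
// must differ; this is only possible because the size is nonzero.
constexpr bool array_elements_are_distinct() {
    emptyS pair[2] = {};
    return &pair[0] != &pair[1];
}

static_assert(array_elements_are_distinct(), "distinct objects, distinct addresses");
```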
Unions
A union is a class defined with the class-key union; it holds only one data member at a time. Here class-key means one of the keywords class, struct or union.
Now let's see what the memory layout of a union looks like. Depending on your machine you can get different sizeof values. Here we see that double has the largest size, so the union type contains only a double.
typedef void (*FunPtrType)(void);
union U {
    int _i;
    float _f;
    char _c;
    double _d;
    void* _p;
    FunPtrType _fp;
};
int main() {
    int sizeInt = sizeof(int);
    int sizefloat = sizeof(float);
    int sizeChar = sizeof(char);
    int sizeDouble = sizeof(double);
    int sizeV = sizeof(void*);
    int sizeFP = sizeof(FunPtrType);
    U u;
    return 1;
}
%union.U = type { double }
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %sizeInt = alloca i32, align 4
  %sizefloat = alloca i32, align 4
  %sizeChar = alloca i32, align 4
  %sizeDouble = alloca i32, align 4
  %sizeV = alloca i32, align 4
  %sizeFP = alloca i32, align 4
  %u = alloca %union.U, align 8
  store i32 0, i32* %retval
  store i32 4, i32* %sizeInt, align 4
  store i32 4, i32* %sizefloat, align 4
  store i32 1, i32* %sizeChar, align 4
  store i32 8, i32* %sizeDouble, align 4
  store i32 4, i32* %sizeV, align 4
  store i32 4, i32* %sizeFP, align 4
  ret i32 1
}
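The relationship between the union and its largest member can also be stated as compile-time checks. The sketch below repeats the union U from the example; kLargestMember and store_then_read are hypothetical names added for illustration, and std::max on an initializer list is assumed to be usable in constant expressions (C++14 or later). On common platforms the size equals that of the largest member, but the checks only require it to be at least that large.

```cpp
#include <algorithm>  // std::max (constexpr since C++14)
#include <cstddef>    // offsetof, std::size_t

typedef void (*FunPtrType)(void);

union U {
    int _i;
    float _f;
    char _c;
    double _d;
    void* _p;
    FunPtrType _fp;
};

// Size of the largest member on the current platform.
constexpr std::size_t kLargestMember =
    std::max({sizeof(int), sizeof(float), sizeof(char),
              sizeof(double), sizeof(void*), sizeof(FunPtrType)});

// The union must be able to hold its largest member.
static_assert(sizeof(U) >= kLargestMember, "union holds its largest member");

// All members start at the beginning of the union and share its storage.
static_assert(offsetof(U, _i) == 0 && offsetof(U, _d) == 0, "members overlap");

// Only one member is active at a time: write _d, then read the same member.
inline double store_then_read(double value) {
    U u;
    u._d = value;
    return u._d;
}
```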
Bibliography:
- Name mangling in the Itanium C++ ABI.