Introduction
The VC++ compiler can create a text file that shows the assembly code
generated for a C/C++ file. I have often used this file to see the kind of code
the compiler generates. It provides great insight into concepts such as
exception handling, vtables and vbtables.
A very elementary knowledge of assembly language is sufficient to
understand the listing file. (Refer to Matt Pietrek's article
on basic assembly language for a brief introduction.)
The purpose of this article, the first of two in the series, is to see how
the listing file helps us understand the inner workings of the C++ compiler.
Setting the Listing File
You can set the C/C++ compiler options to generate the listing file in the
Project Settings dialog of VC6 as shown below.
In VC++.NET you can set the same option in the Project Properties dialog.
The compiler can generate four types of listing:
-
Assembly code only (.asm)
-
Assembly code and machine code. (.cod)
-
Assembly code with source code. (.asm)
-
Assembly code with machine and the source code. (.cod)
Viewing the Listing Files
Let's examine the listing generated for the following application.
#include <stdio.h>
int main(int argc, char* argv[])
{
printf("Hello World!");
return 0;
}
1. Assembly Only Listing (/FA)
The assembly listing is placed in a file with the .asm extension in the
intermediate directory. For example, if the source file is main.cpp then there
will be a main.asm file in the intermediate directory. Here is the code for the
main function from the listing file:
PUBLIC _main
PUBLIC ??_C@_0N@GCDOMLDM@Hello?5World?$CB?$AA@
EXTRN _printf:NEAR
CONST SEGMENT
??_C@_0N@GCDOMLDM@Hello?5World?$CB?$AA@ DB 'Hello World!', 00H
CONST ENDS
_TEXT SEGMENT
_argc$ = 8
_argv$ = 12
_main PROC NEAR
push OFFSET FLAT:??_C@_0N@GCDOMLDM@Hello?5World?$CB?$AA@
call _printf
add esp, 4
xor eax, eax
ret 0
_main ENDP
END
Let's try to examine the listing.
-
Lines beginning with ; are comments
-
PUBLIC _main means that the _main function is visible to
other files (as opposed to static functions). Static
functions get no PUBLIC directive.
-
CONST SEGMENT marks the beginning of the CONST data segment. The VC++
compiler places constant data such as string literals in this segment, so we
see that the string "Hello World!" is placed here. Altering any data
in the CONST segment causes an access violation exception to be thrown.
More on this later.
-
_TEXT SEGMENT marks the beginning of another segment. The compiler places all
the code in this segment.
-
_argc$ = 8 and _argv$ = 12 give the stack positions of the arguments argc
and argv. They are offsets from the frame pointer (CPU register EBP) of a
standard stack frame: offset 0 holds the caller's saved EBP and offset 4 holds
the return address, so argc sits at offset 8 and argv at offset 12.
-
_main PROC NEAR signals the start of the function _main. Notice that for C
functions (functions declared with extern "C") the name is simply prefixed
with _, whereas the name of a C++ function is decorated.
-
Next we see that the compiler pushes the address of the string "Hello
World!" onto the stack and calls the function printf. After the call returns,
the stack pointer is incremented by 4 to discard the argument. This is done by
the caller because printf uses the C calling convention.
-
EAX is the register which holds the return value of a function. We see that EAX
is XORed with itself. (This is a quick way to set a register to zero.) This is
because our original code returns 0 from the function main.
-
Finally, ret 0 is the instruction for returning from
the function. The numeric argument following ret is the number of
extra bytes to add to the stack pointer after the return address has been
popped. Here it is 0 because the caller removes the arguments.
So this was the assembly-only listing. Let's see what the other three listings
look like.
2. Assembly With Source Code (/FAs)
This listing gives a far clearer picture than the first one.
It shows the source text as well as the assembly code.
_TEXT SEGMENT
_argc$ = 8
_argv$ = 12
_main PROC NEAR
push OFFSET FLAT:??_C@_0M@KPLPPDAC@Hello?5World?$AA@
call _printf
add esp, 4
xor eax, eax
3. Assembly With Machine Code (/FAc)
This listing shows the instruction codes as well as the instruction
mnemonics. It is normally generated in a .cod file, so for this
example the listing can be seen in the main.cod file.
_TEXT SEGMENT
_argc$ = 8
_argv$ = 12
_main PROC NEAR
00000 68 00 00 00 00 push OFFSET FLAT:??_C@_0M@KPLPPDAC@Hello?5World?$AA@
00005 e8 00 00 00 00 call _printf
0000a 83 c4 04 add esp, 4
0000d 33 c0 xor eax, eax
0000f c3 ret 0
_main ENDP
4. Assembly, Machine and Source Code (/FAsc)
This listing is also generated in a .cod file. As
expected this shows source code as well as the machine code with
the assembly.
_TEXT SEGMENT
_argc$ = 8
_argv$ = 12
_main PROC NEAR
00000 68 00 00 00 00 push OFFSET FLAT:??_C@_0M@KPLPPDAC@Hello?5World?$AA@
00005 e8 00 00 00 00 call _printf
0000a 83 c4 04 add esp, 4
0000d 33 c0 xor eax, eax
0000f c3 ret 0
_main ENDP
We have now seen all four types of listing generated by the compiler. In
general it is not necessary to look at the machine code. In most cases
Assembly with Source (/FAs) is the most useful listing.
Having seen the different types of listing and how to generate them,
let's see what useful information we can gather from a listing.
Const Segment
We saw that the compiler placed the constant string "Hello World" in the
CONST segment. Let's study the implications of this through the following
sample application.
#include <stdio.h>
char* szHelloWorld = "Hello World";
int main(int argc, char* argv[])
{
printf(szHelloWorld);
szHelloWorld[1] = 'o';
szHelloWorld[2] = 'l';
szHelloWorld[3] = 'a';
szHelloWorld[4] = '\'';
printf(szHelloWorld);
return 0;
}
This sample app first prints "Hello World", then tries to change
"Hello" to "Hola'" and finally prints the altered string. Let's build
and run this application. To our amazement we see that the
application crashes with an access violation exception at the line
szHelloWorld[2] = 'l';.
Let's alter the line
char* szHelloWorld = "Hello World";
to
char szHelloWorld[] = "Hello World";
The application runs successfully this time. Examining the listing
shows us why.
-
In the first case the data "Hello World" is placed in the CONST segment, which
is a read-only segment.
CONST SEGMENT
?szHelloWorld@@3PADA DB 'Hello World', 00H
CONST ENDS
-
In the second case the data is placed in the _DATA segment, which is a
read-write segment.
_DATA SEGMENT
?szHelloWorld@@3PADA DB 'Hello World', 00H
_DATA ENDS
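The difference between the two declarations can be sketched in portable C++. This is only an illustration: the names kReadOnly, g_writable and ConvertToHola are hypothetical and do not appear in the sample above. The point is that a pointer to a literal is best declared const, while an array holds its own writable copy of the characters.

```cpp
#include <cassert>
#include <cstring>

// A pointer to a literal should be const: the characters may live in
// read-only storage (the CONST segment in the listing above).
const char* const kReadOnly = "Hello World";

// An array is initialised with its own writable copy of the characters
// (the _DATA segment in the listing above).
char g_writable[] = "Hello World";

// Hypothetical helper: applies the same four edits as the sample to a
// caller-supplied writable buffer of at least five characters.
void ConvertToHola(char* buffer, std::size_t length)
{
    assert(buffer != 0 && length >= 5);
    buffer[1] = 'o';
    buffer[2] = 'l';
    buffer[3] = 'a';
    buffer[4] = '\'';
}
```

Calling ConvertToHola(g_writable, strlen(g_writable)) edits the array in place. Passing kReadOnly is rejected at compile time, which turns the runtime crash into a compiler error.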
Function Inlining
One of the most useful things that can be seen from the assembly listing is
whether a function was inlined or not. The inline keyword (or the
Microsoft-specific __inline) does not force the compiler to make a function
inline. A variety of factors, mostly undocumented, determine whether a function
is inlined or not. Conversely, a function that is not marked inline may still
be inlined. The assembly listing is very valuable in finding
out what actually happened. Let's take the following example:
void ConvertStr(char* argv)
{
szHelloWorld[1] = 'o';
szHelloWorld[2] = 'l';
szHelloWorld[3] = 'a';
szHelloWorld[4] = '\'';
}
int main(int argc, char* argv[])
{
printf(szHelloWorld);
ConvertStr(szHelloWorld);
printf(szHelloWorld);
return 0;
}
Let's examine the listing for the main function.
_main PROC NEAR
push OFFSET FLAT:?szHelloWorld@@3PADA
call _printf
push OFFSET FLAT:?szHelloWorld@@3PADA
mov BYTE PTR ?szHelloWorld@@3PADA+1, 111
mov BYTE PTR ?szHelloWorld@@3PADA+2, 108
mov BYTE PTR ?szHelloWorld@@3PADA+3, 97
mov BYTE PTR ?szHelloWorld@@3PADA+4, 39
call _printf
add esp, 8
xor eax, eax
ret 0
Note that we don't see any call instruction for ConvertStr. Instead we see
several mov BYTE PTR instructions which modify the characters in the string
(the work done by the ConvertStr function). This indicates that ConvertStr was
in fact expanded inline.
Let's disable inline expansion of ConvertStr by marking it with
_declspec(noinline). The listing for main then becomes:
_main PROC NEAR
push OFFSET FLAT:?szHelloWorld@@3PADA
call _printf
push OFFSET FLAT:?szHelloWorld@@3PADA
call ?ConvertStr@@YAXPAD@Z
push OFFSET FLAT:?szHelloWorld@@3PADA
call _printf
add esp, 12
xor eax, eax
ret 0
_main ENDP
As expected we see a call instruction to the ConvertStr function. Whether or
not a function gets inlined is at the discretion of the compiler. In
some cases, when an inline function calls another inline function, only
one of them is inlined. Use of #pragma inline_depth() and
#pragma inline_recursion() sometimes seems to help. Again, the listing file is
very useful in showing whether a function was inlined or not.
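To repeat this experiment with other compilers, one possible approach is to hide the attribute behind a macro. The following is a minimal sketch; the macro name NO_INLINE and the functions are hypothetical, while the two attributes are the documented MSVC and GCC/Clang spellings. Compiling it with a listing option and looking for a call instruction shows which of the two helpers was expanded inline.

```cpp
#include <cassert>

// Hypothetical macro name; it falls back to nothing on compilers that
// support neither attribute.
#if defined(_MSC_VER)
#define NO_INLINE __declspec(noinline)
#elif defined(__GNUC__)
#define NO_INLINE __attribute__((noinline))
#else
#define NO_INLINE
#endif

// Candidate for inlining: small and visible at the call site.
inline int AddInline(int a, int b) { return a + b; }

// Expected to show up as a separate call in the listing.
NO_INLINE int AddNoInline(int a, int b) { return a + b; }

int SumOfThree(int a, int b, int c)
{
    int viaInline = AddInline(AddInline(a, b), c);
    int viaCall = AddNoInline(AddNoInline(a, b), c);
    assert(viaInline == viaCall); // inlining must not change the result
    return viaCall;
}
```

The attribute only affects code generation, so both paths compute the same value and the difference is visible only in the listing.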
Destructors
In order to examine the behavior of the destructors, let's take the
following example.
class SmartString
{
private:
char* m_sz;
public:
SmartString(char* sz)
{
m_sz = new char[strlen(sz) + 1];
strcpy(m_sz, sz);
}
char* ToStr()
{
return m_sz;
}
_declspec(noinline) ~SmartString()
{
delete[] m_sz;
}
};
int main(int argc, char* argv[])
{
SmartString sz1("Hello World");
printf(sz1.ToStr());
return 0;
}
The code generated looks like the following.
push ecx
push OFFSET FLAT:??_C@_0M@KPLPPDAC@Hello?5World?$AA@
lea ecx, DWORD PTR _sz1$[esp+8]
call ??0SmartString@@QAE@PAD@Z
mov eax, DWORD PTR _sz1$[esp+4]
push eax
call _printf
add esp, 4
lea ecx, DWORD PTR _sz1$[esp+4]
call ??1SmartString@@QAE@XZ
xor eax, eax
pop ecx
ret 0
_main ENDP
This is quite straightforward and we clearly see the call to the destructor
before the function exits. What is interesting is to see how destructors work
for an array of objects. So we modify the app a little bit. (For the array
declaration to compile, SmartString also needs a default constructor.)
int main(int argc, char* argv[])
{
SmartString arr[2];
arr[0] = ("Hello World");
arr[1] = ("Hola' World");
printf(arr[0].ToStr());
printf(arr[1].ToStr());
return 0;
}
The last few lines of the main function now look like the following.
push OFFSET FLAT:??1SmartString@@QAE@XZ
push 2
push 4
lea eax, DWORD PTR _arr$[ebp]
push eax
call ??_I@YGXPAXIHP6EX0@Z@Z
xor eax, eax
leave
ret 0
This code is a call to the function ??_I@YGXPAXIHP6EX0@Z@Z, whose name in
English is "vector destructor iterator" (this can be seen in the
listing). The function is generated automatically by the compiler. Arguments
are pushed from right to left, so a translation of the above assembly code
into C++ would be:
vector_destructor_iterator(arr, 4, 2,
&SmartString::~SmartString);
What exactly is this "vector destructor iterator"? We know that when an array
of objects goes out of scope, the destructor for each object
in the array is called. This is what the vector destructor iterator does. Let's
examine its code and try to reverse engineer it.
PUBLIC ??_I@YGXPAXIHP6EX0@Z@Z
_TEXT SEGMENT
___t$ = 8
___s$ = 12
___n$ = 16
___f$ = 20
??_I@YGXPAXIHP6EX0@Z@Z PROC NEAR
push ebp
mov ebp, esp
mov eax, DWORD PTR ___n$[ebp]
mov ecx, DWORD PTR ___s$[ebp]
imul ecx, eax
push edi
mov edi, DWORD PTR ___t$[ebp]
add edi, ecx
dec eax
js SHORT $L912
push esi
lea esi, DWORD PTR [eax+1]
$L911:
sub edi, DWORD PTR ___s$[ebp]
mov ecx, edi
call DWORD PTR ___f$[ebp]
dec esi
jne SHORT $L911
pop esi
$L912:
pop edi
pop ebp
ret 16
??_I@YGXPAXIHP6EX0@Z@Z ENDP
From our previous discussion we know that ___t$ = 8 etc. denote the parameters
of the function, in the order t, s, n, f. We already know how the function was
called from the assembly code of the _main function. Based on this we can
figure out the signature of the function as:
typedef void (*DestructorPtr)(void* object);
void vector_destructor_iterator(void* _t,
int _s, int _n, DestructorPtr _f)
The function implementation can be reverse engineered to something like
void vector_destructor_iterator(void* _t, int _s, int _n, DestructorPtr _f)
{
// Start just past the last element and walk backwards.
unsigned char* ptr = (unsigned char*)_t + _s * _n;
while (_n-- > 0)
{
ptr -= _s;
_f(ptr);
}
}
Basically, it calls the destructor for every object in the array, from the
last element to the first. Now we can figure out what the individual
parameters mean.
vector_destructor_iterator(arr, 4, 2,
&SmartString::~SmartString);
-
The first parameter is the pointer to the array.
-
The second parameter is the size of the individual elements, in our case
sizeof(SmartString) (= 4).
-
The third parameter is the number of elements in the array (2).
-
The fourth parameter is the address of the destructor function.
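The reverse-order walk can be tried out in portable C++. The sketch below is an illustration under stated assumptions, not the compiler's actual helper: a plain function pointer stands in for the __thiscall destructor, and the Tracked type, the g_destroyed log and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a destructor (assumption: the real helper calls a
// __thiscall function with the object address in ECX).
typedef void (*DestructorPtr)(void* object);

// Hypothetical element type that lets us record the destruction order.
struct Tracked
{
    int id;
};

std::vector<int> g_destroyed;

void DestroyTracked(void* object)
{
    g_destroyed.push_back(static_cast<Tracked*>(object)->id);
}

// Same parameter order as the listing: array, element size, count, destructor.
void VectorDestructorIterator(void* t, std::size_t s, int n, DestructorPtr f)
{
    assert(t != 0 && f != 0 && n >= 0);
    // Start just past the last element.
    unsigned char* ptr = static_cast<unsigned char*>(t) + s * n;
    while (n-- > 0)
    {
        ptr -= s; // step back one element
        f(ptr);   // "destroy" it
    }
}
```

Running it over a three-element array records the ids from last to first, which matches the C++ rule that array elements are destroyed in the reverse order of their construction.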
In this case the compiler knew the exact number of elements in the array,
so it passes 2 as the element count to the vector destructor iterator.
The question is what happens with dynamic arrays allocated using new[].
In that case the compiler cannot know the size of the array at the point of
deletion, as the array is allocated at runtime. To find out, we modify the
application once again.
int main(int argc, char* argv[])
{
SmartString* arr = new SmartString[2];
arr[0] = "Hello World";
arr[1] = "Hola' World";
printf(arr[0].ToStr());
printf(arr[1].ToStr());
delete [] arr;
return 0;
}
The assembly listing now looks like:-
push 3
mov ecx, esi
call ??_ESmartString@@QAEPAXI@Z
pop edi
xor eax, eax
pop esi
We see that a new function has come into the picture:
??_ESmartString@@QAEPAXI@Z. In English this function is called the "vector
deleting destructor". Here is its assembly code:
PUBLIC ??_ESmartString@@QAEPAXI@Z
_TEXT SEGMENT
___flags$ = 8
??_ESmartString@@QAEPAXI@Z PROC NEAR
push ebx
mov bl, BYTE PTR ___flags$[esp]
test bl, 2
push esi
mov esi, ecx
je SHORT $L896
push edi
push OFFSET FLAT:??1SmartString@@QAE@XZ
lea edi, DWORD PTR [esi-4]
push DWORD PTR [edi]
push 4
push esi
call ??_I@YGXPAXIHP6EX0@Z@Z
test bl, 1
je SHORT $L897
push edi
call ??3@YAXPAX@Z
pop ecx
$L897:
mov eax, edi
pop edi
jmp SHORT $L895
$L896:
mov ecx, esi
call ??1SmartString@@QAE@XZ
test bl, 1
je SHORT $L899
push esi
call ??3@YAXPAX@Z
pop ecx
$L899:
mov eax, esi
$L895:
pop esi
pop ebx
ret 4
??_ESmartString@@QAEPAXI@Z ENDP
_TEXT ENDS
This function uses the __thiscall calling convention, meaning that the this
pointer is passed in ECX. This is the calling convention used to
invoke member functions. The C++ pseudo code for this would be:
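void* SmartString::vector_deleting_destructor(unsigned int flags)
{
if (flags & 2)
{
// Vector case: the element count sits in the 4 bytes before the array.
unsigned char* block = (unsigned char*)this - 4;
int numElems = *(int*)block;
vector_destructor_iterator(this, 4, numElems,
&SmartString::~SmartString);
if (flags & 1)
operator delete(block);
return block;
}
// Scalar case: destroy a single object.
this->~SmartString();
if (flags & 1)
operator delete(this);
return this;
}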
So we see that the number of elements is stored just before the first element
of the array. This means that new[] must allocate an extra 4 bytes.
Looking at the assembly generated for the new[] call confirms this.
push 12
call ??2@YAPAXI@Z
The new operator takes the size to be allocated as a parameter. We see that the
number 12 is pushed on the stack. The size of SmartString is only 4 bytes, and
the total size of two elements is 8 bytes. So the new operator does allocate an
extra 4 bytes to record the number of elements in the array. To confirm this
further, overload operator new[] in SmartString. It can be seen that the
amount of memory requested is always 4 bytes more than the memory
required to store the array itself.
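The experiment with an overloaded operator new[] can be sketched as follows. The class Counted and the helper ExtraBytesForArray are hypothetical names used only for this illustration. The size of the extra space depends on the compiler and platform, so the code reports it instead of assuming 4 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class used only for this experiment.
class Counted
{
public:
    static std::size_t lastRequest; // bytes asked for by the last new[]

    Counted() : m_value(0) {}
    ~Counted() { m_value = -1; }    // user-written destructor

    static void* operator new[](std::size_t bytes)
    {
        lastRequest = bytes;
        void* p = std::malloc(bytes);
        if (p == 0)
            throw std::bad_alloc();
        return p;
    }

    static void operator delete[](void* p) { std::free(p); }

private:
    int m_value;
};

std::size_t Counted::lastRequest = 0;

// Returns how many bytes beyond count * sizeof(Counted) were requested.
std::size_t ExtraBytesForArray(std::size_t count)
{
    assert(count > 0);
    Counted* arr = new Counted[count];
    std::size_t extra = Counted::lastRequest - count * sizeof(Counted);
    delete[] arr;
    return extra;
}
```

With the 32-bit VC++ build discussed here one would expect ExtraBytesForArray(2) to return 4. Other compilers and 64-bit targets may reserve a different amount.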
Constructors
Let's see the assembly code for the section where the new operator is called.
push 12
call ??2@YAPAXI@Z
test eax, eax
pop ecx
je SHORT $L980
push 2
pop ecx
push OFFSET FLAT:??0SmartString@@QAE@XZ
push ecx
lea esi, DWORD PTR [eax+4]
push 4
push esi
mov DWORD PTR [eax], ecx
call ??_H@YGXPAXIHP6EPAX0@Z@Z
jmp SHORT $L981
$L980:
xor esi, esi
$L981:
The ??_H@YGXPAXIHP6EPAX0@Z@Z function is the "vector constructor iterator", the
counterpart of the vector destructor iterator.
Translated into pseudo C++ code this would look like
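unsigned char* allocated =
new unsigned char[12];
SmartString* arr = NULL;
if (allocated != NULL)
{
// Store the element count (2) in the 4 bytes ahead of the array.
*(int*)allocated = 2;
arr = (SmartString*)(allocated + 4);
vector_constructor_iterator(arr,
4, 2, &SmartString::SmartString);
}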
The vector constructor iterator works in the same way as the vector destructor
iterator. It calls the constructor for all elements in the array.
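The whole life cycle (allocate, store the count, construct, then read the count back and destroy) can be imitated by hand in portable C++. This is a sketch of the layout described above, not the code the compiler emits. The Item type and the two function names are hypothetical, and the header is an int so that it matches the 4-byte count seen in the listing.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical element type; the counter makes the effect observable.
struct Item
{
    static int live;
    int value;
    Item() : value(7) { ++live; }
    ~Item() { --live; }
};

int Item::live = 0;

// Mimics new[]: allocate, store the count, construct each element.
Item* AllocateArray(int count)
{
    assert(count > 0);
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(int) + count * sizeof(Item)));
    if (block == 0)
        return 0;
    std::memcpy(block, &count, sizeof(int)); // count header before the array
    Item* first = reinterpret_cast<Item*>(block + sizeof(int));
    for (int i = 0; i < count; ++i)
        new (first + i) Item();              // construct in increasing order
    return first;
}

// Mimics delete[]: read the count back, destroy in reverse, free the block.
int ReleaseArray(Item* first)
{
    assert(first != 0);
    unsigned char* block =
        reinterpret_cast<unsigned char*>(first) - sizeof(int);
    int count = 0;
    std::memcpy(&count, block, sizeof(int));
    for (int i = count - 1; i >= 0; --i)
        first[i].~Item();
    std::free(block);
    return count;
}
```

ReleaseArray needs no size argument because the count travels with the block, which is the reason delete[] can work on a plain pointer.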
Exceptions
In all my examples I have disabled exception handling before compiling the
application. Enabling exception handling causes a lot of extra code to be
emitted by the compiler. This has been described in great detail by
Vishal Kochhar.
Calling Conventions
Nemanja Trifunovic describes calling conventions in great detail in
his article.
Conclusion
I have tried to examine some aspects of the inner workings of the C++ compiler
using the assembly listing. Examining the listing gives a clear picture of what
the compiler does under the hood with our C++ code, which helps us write better
and more efficient C++ code. There are lots of other things that can be found
out from the assembly listing. In the next article I will discuss how the
compiler implements vftables, vbtables and RTTI.