Introduction
Hi! In this article, I will explain three common x86 calling conventions (__cdecl, __stdcall and __fastcall), along with their advantages and disadvantages. I will also explain why a __cdecl function can take a variable number of arguments while a __stdcall function cannot, and cover fast call.
Background
Knowledge of assembly coding will be helpful but I will try to explain as much as I can, so don't worry and just relax.
Using the code
You must have come across the famous words "calling convention" at some point in your career. If not, be prepared for runtime errors such as the value of ESP not being saved properly across function calls. So what is a calling convention? Why can a __cdecl function take a variable number of arguments when a __stdcall function cannot? What is fast call? Don't worry, all your doubts will be cleared. So, ready to go...
We will start with the cdecl calling convention, or the C declaration style. Have a look at the simple code below:
#include <stdio.h>
int __cdecl Function(int a, int b, int c)
{
    return a + b - c;
}
void main()
{
    int r = Function(1, 2, 3);
}
Just a simple function and a main. Now we will look at the assembly. For those who don't know assembly, don't worry, it's not a big deal thanks to Microsoft. We will use the Dumpbin utility that ships with VC++ to disassemble the code. First compile the code from the command line with cl Cdecltest.c, which produces the obj and exe files. Then run Dumpbin on the obj file and redirect the output to a text file:
dumpbin /disasm Cdecltest.obj > Cdecltest.txt
The output is:
Dump of file CdeclTest.obj
File Type: COFF OBJECT
_Function:
00000000: 55 push ebp
00000001: 8B EC mov ebp,esp
00000003: 8B 45 08 mov eax,dword ptr [ebp+8]
00000006: 03 45 0C add eax,dword ptr [ebp+0Ch]
00000009: 2B 45 10 sub eax,dword ptr [ebp+10h]
0000000C: 5D pop ebp
0000000D: C3 ret
_main:
0000000E: 55 push ebp
0000000F: 8B EC mov ebp,esp
00000011: 51 push ecx
00000012: 6A 03 push 3
00000014: 6A 02 push 2
00000016: 6A 01 push 1
00000018: E8 00 00 00 00 call 0000001D
0000001D: 83 C4 0C add esp,0Ch
00000020: 89 45 FC mov dword ptr [ebp-4],eax
00000023: 8B E5 mov esp,ebp
00000025: 5D pop ebp
00000026: C3 ret
The first two instructions of main (push ebp and mov ebp,esp) are known as the prolog, and in these listings they always have the opcode bytes 55 8B EC. main passes the parameters to the function by pushing them on the stack. Arguments are pushed from right to left, i.e., push 3, push 2 and push 1, and finally the call instruction pushes the return address, which is where execution resumes after the function finishes. Now take a look at the stack diagram:
Initially, the stack pointer is at 98H, i.e., a high memory location, and it moves toward lower addresses as we go on pushing, so the stack becomes as shown above. Now the problem begins in the called function. If it popped three entries, assuming three arguments were pushed, it would pop the return address first, and then it wouldn't know what address to return to. So have a look at the assembly code of the function and see how it reads the arguments without disturbing the return address. As seen, it first pushes ebp (the base pointer) at address 80H, then copies the stack pointer into ebp.
Now look at the instruction mov eax,dword ptr [ebp+8], which reads address 88H, i.e., the pushed element 1. The next instruction, add eax,dword ptr [ebp+0Ch], reads the second element, 2, and adds it. Similarly, sub eax,dword ptr [ebp+10h] reads the third element and subtracts it. Then it pops ebp from address 80H, so the stack becomes as in figure 1, and ret returns to the correct return address, which is stored at 84H.
This is all the assembly programming we need, so back to our main discussion. Sorry for diverting you from the track. The problem now is that the stack is not cleaned, i.e., the three elements pushed on the stack have not been popped. So someone needs to clean it up. Who will it be? Back to the assembly: in main, right after the call, there is an instruction add esp,0Ch which removes 0Ch (12) bytes from the stack, since three 4-byte elements were pushed. If two elements had been pushed, it would be... you got it right... add esp,08h.
That's it. So in the cdecl calling convention, the caller (here, main) has the responsibility of cleaning the stack.
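The walk-through above can be replayed in a small simulation. This is only a sketch: the Machine type, the helper names and the return-address value are made up for illustration, and it assumes that 98H is the value of ESP just before main's push ecx, which is what makes the frame land at 80H as described.

```c
#include <stdint.h>

/* Toy model of a 32-bit stack. The addresses and the return-address value
   are illustrative assumptions, not real machine state. */
enum { STACK_TOP = 0x98, SLOT = 4 };

typedef struct {
    int32_t  mem[STACK_TOP / SLOT + 1]; /* mem[i] models the dword at address i * 4 */
    uint32_t esp;
    uint32_t ebp;
    uint32_t callee_frame;              /* value of ebp while inside the callee */
} Machine;

void machine_init(Machine *m) {
    *m = (Machine){0};
    m->esp = STACK_TOP;
}

void push(Machine *m, int32_t v) {
    m->esp -= SLOT;                     /* the stack grows toward lower addresses */
    m->mem[m->esp / SLOT] = v;
}

int32_t pop(Machine *m) {
    int32_t v = m->mem[m->esp / SLOT];
    m->esp += SLOT;
    return v;
}

int32_t load(const Machine *m, uint32_t addr) {
    return m->mem[addr / SLOT];
}

/* Mirrors the body of _Function from the listing. */
int32_t function_body(Machine *m) {
    push(m, (int32_t)m->ebp);           /* push ebp */
    m->ebp = m->esp;                    /* mov ebp,esp */
    m->callee_frame = m->ebp;
    int32_t eax = load(m, m->ebp + 0x08);   /* mov eax,[ebp+8]   -> a */
    eax += load(m, m->ebp + 0x0C);          /* add eax,[ebp+0Ch] -> b */
    eax -= load(m, m->ebp + 0x10);          /* sub eax,[ebp+10h] -> c */
    m->ebp = (uint32_t)pop(m);          /* pop ebp */
    return eax;
}

/* Mirrors main's side of a __cdecl call: push right to left, call, clean up. */
int32_t call_cdecl(Machine *m, int32_t a, int32_t b, int32_t c) {
    push(m, 0);                         /* push ecx: reserves the slot for r */
    push(m, c);
    push(m, b);
    push(m, a);
    push(m, 0x1D);                      /* call: pushes the return address */
    int32_t eax = function_body(m);
    (void)pop(m);                       /* ret: pops only the return address */
    m->esp += 3 * SLOT;                 /* add esp,0Ch: the caller cleans up */
    return eax;
}
```

After call_cdecl(&m, 1, 2, 3) on a freshly initialized machine, the stack pointer is back where it was before the three pushes, and that only happens because the caller adjusted it.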
StdCall (std calling convention __stdcall)
#include <stdio.h>
int __stdcall Function(int a, int b, int c)
{
    return a + b - c;
}
void main()
{
    int r = Function(1, 2, 3);
}
And similarly, let's look at the assembly code: cl stdcalltest.c and then dumpbin /disasm stdcalltest.obj > stdcalltest.txt.
Dump of file StdcallTest.obj
File Type: COFF OBJECT
_Function@12:
00000000: 55 push ebp
00000001: 8B EC mov ebp,esp
00000003: 8B 45 08 mov eax,dword ptr [ebp+8]
00000006: 03 45 0C add eax,dword ptr [ebp+0Ch]
00000009: 2B 45 10 sub eax,dword ptr [ebp+10h]
0000000C: 5D pop ebp
0000000D: C2 0C 00 ret 0Ch
_main:
00000010: 55 push ebp
00000011: 8B EC mov ebp,esp
00000013: 51 push ecx
00000014: 6A 03 push 3
00000016: 6A 02 push 2
00000018: 6A 01 push 1
0000001A: E8 00 00 00 00 call 0000001F
0000001F: 89 45 FC mov dword ptr [ebp-4],eax
00000022: 8B E5 mov esp,ebp
00000024: 5D pop ebp
00000025: C3 ret
As seen from the assembly code, in stdcall, the stack is cleaned by the called function, i.e., by ret 0Ch in the last line of the function. Notice that main no longer has an add esp,0Ch after the call.
So if a function is called twenty times, the cleanup code appears only once, inside the called function, if __stdcall is used. But if __cdecl is used, it appears twenty times, i.e., after every call in the calling code, and if we have, say, fifty functions in a file, each of which is called twenty times, then the EXE built with cdecl will be larger. But then what is the advantage of __cdecl? It can do something that __stdcall cannot.
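To put rough numbers on the size argument, here is a small sketch using the instruction lengths visible in the two listings: add esp,0Ch is 3 bytes (83 C4 0C), a plain ret is 1 byte (C3) and ret 0Ch is 3 bytes (C2 0C 00). The function names are hypothetical, and the estimate assumes unoptimized code like the listings, where every call site gets its own cleanup instruction.

```c
#include <stddef.h>

/* Instruction lengths read off the listings in this article. */
enum {
    ADD_ESP_IMM8_BYTES = 3,  /* 83 C4 0C : add esp,0Ch */
    RET_BYTES          = 1,  /* C3       : ret         */
    RET_IMM16_BYTES    = 3   /* C2 0C 00 : ret 0Ch     */
};

/* __cdecl: one add esp,N after every call site. */
size_t cdecl_cleanup_bytes(size_t functions, size_t calls_per_function) {
    return functions * calls_per_function * ADD_ESP_IMM8_BYTES;
}

/* __stdcall: each function pays once, for the longer form of ret. */
size_t stdcall_cleanup_bytes(size_t functions) {
    return functions * (RET_IMM16_BYTES - RET_BYTES);
}
```

For the fifty functions called twenty times each, this gives 3000 bytes of cleanup code for cdecl against 100 bytes for stdcall.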
With the __cdecl calling convention, you can pass a variable number of arguments. Remember the ellipsis (...). This is not possible with __stdcall, whether the code is C or C++. Let's see why. As seen in __stdcall, the cleanup code is placed only once, in the function, and its operand is the number of bytes of arguments passed, i.e., if three int arguments are passed, it is ret 0Ch. That value is fixed when the function is compiled, so the function cannot clean up after a variable number of arguments. Whereas in __cdecl, the cleanup code is emitted after every call, and the caller knows how many arguments it passed each time, so it can clean up correctly. Got it? Else refer to the above explanation and the assembly code. Also observe the function name mangling in the assembly code in both cases. The function name is prefixed by _. But in the case of __stdcall, the name also ends with @somevalue, where the value is the number of bytes of arguments passed on the stack that have to be cleaned up before returning. So in our case, it is ...yes, you got it right: 12.
For __cdecl, the call is to _Function, and for __stdcall, to _Function@12.
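The naming rule seen in the two listings can be written down as a short sketch. The decorate function and the Convention enum are hypothetical names made up for illustration, and the rule covers only what the listings show for plain C functions.

```c
#include <stdio.h>
#include <stddef.h>

typedef enum { CONV_CDECL, CONV_STDCALL } Convention;

/* Writes the decorated name into out (at most cap bytes, always terminated).
   arg_bytes is the total size of the arguments on the stack, e.g. 3 * 4 = 12
   for three ints. Returns what snprintf returns, or -1 for an unknown value. */
int decorate(char *out, size_t cap, Convention conv,
             const char *name, unsigned arg_bytes) {
    switch (conv) {
    case CONV_CDECL:
        return snprintf(out, cap, "_%s", name);            /* _Function    */
    case CONV_STDCALL:
        return snprintf(out, cap, "_%s@%u", name, arg_bytes); /* _Function@12 */
    }
    return -1;
}
```

Calling decorate with CONV_STDCALL, "Function" and 12 reproduces the _Function@12 label from the dump, while CONV_CDECL ignores the byte count and gives _Function.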
Anyway, one more example to clear things up.
Below is the code using the __cdecl calling convention, and it will compile:
#include <stdio.h>
int __cdecl Function(int a, ...)
{
    return 1;
}
void main()
{
    int r = Function(1, 2, 3);
}
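Below is the same idea written with __stdcall. A callee-cleanup version of this function cannot be generated, because the two calls pass different numbers of arguments. Microsoft's documentation notes that the compiler makes vararg functions __cdecl, so the __stdcall keyword is not honored here: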
#include <stdio.h>
int __stdcall Function(int a, ...)
{
    return a;
}
void main()
{
    int r = Function(1, 2, 3);
    int x = Function(1, 2);
}
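The variadic functions above ignore their extra arguments. As a minimal sketch of how such a function would actually read them, here is a version built on the standard stdarg.h macros. SumInts and the VARARGS_CALL macro are hypothetical names; the macro expands to __cdecl only under the Microsoft compiler so that the same file also builds elsewhere.

```c
#include <stdarg.h>

#ifdef _MSC_VER
#define VARARGS_CALL __cdecl   /* spelled out, as in the article's examples */
#else
#define VARARGS_CALL           /* other compilers: use the platform default */
#endif

/* Sums the `count` int arguments that follow the count itself.
   The function cannot discover how many arguments were pushed, so the
   caller has to tell it; cleanup stays the caller's job. */
int VARARGS_CALL SumInts(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; ++i)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

SumInts(3, 1, 2, 3) and SumInts(2, 10, 20) can sit side by side in the same program, because each call site removes exactly the arguments it pushed.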
Now, last but not least, fast call. Here is the code, followed by its assembly:
#include <stdio.h>
int __fastcall Function(int a, int b, int c)
{
    return a + b - c;
}
void main()
{
    int r = Function(1, 2, 3);
}
Microsoft (R) COFF Binary File Dumper Version 6.00.8168
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.
Dump of file FastCallTest.obj
File Type: COFF OBJECT
@Function@12:
00000000: 55 push ebp
00000001: 8B EC mov ebp,esp
00000003: 83 EC 08 sub esp,8
00000006: 89 55 F8 mov dword ptr [ebp-8],edx
00000009: 89 4D FC mov dword ptr [ebp-4],ecx
0000000C: 8B 45 FC mov eax,dword ptr [ebp-4]
0000000F: 03 45 F8 add eax,dword ptr [ebp-8]
00000012: 2B 45 08 sub eax,dword ptr [ebp+8]
00000015: 8B E5 mov esp,ebp
00000017: 5D pop ebp
00000018: C2 04 00 ret 4
_main:
0000001B: 55 push ebp
0000001C: 8B EC mov ebp,esp
0000001E: 51 push ecx
0000001F: 6A 03 push 3
00000021: BA 02 00 00 00 mov edx,2
00000026: B9 01 00 00 00 mov ecx,1
0000002B: E8 00 00 00 00 call 00000030
00000030: 89 45 FC mov dword ptr [ebp-4],eax
00000033: 8B E5 mov esp,ebp
00000035: 5D pop ebp
00000036: C3 ret
In the case of fast call, as seen from the assembly code of main, just one argument is pushed and two are passed in registers. I.e., the first and second arguments are passed in the ECX and EDX registers and the rest as normal, i.e., on the stack, from where the called function cleans them up (ret 4). Since registers are used for passing, the call can be faster, but a maximum of two arguments can be passed in registers. Also notice the name mangling: the function name begins with @ and ends with the number of bytes of arguments passed, i.e., in our case, @Function@12.
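The bookkeeping for fast call can be sketched as below. FastcallLayout and fastcall_layout are hypothetical names, and the sketch assumes every argument is a 4-byte int, as in the example; other argument types are not modeled.

```c
/* Where the arguments of a __fastcall function end up, assuming they are
   all 4-byte ints. */
typedef struct {
    unsigned in_registers;    /* arguments passed in ECX and EDX */
    unsigned on_stack;        /* arguments pushed on the stack */
    unsigned callee_pops;     /* operand of the callee's ret, in bytes */
    unsigned decorated_bytes; /* number after the last @ in the name */
} FastcallLayout;

FastcallLayout fastcall_layout(unsigned int_args)
{
    FastcallLayout layout;
    layout.in_registers    = int_args < 2 ? int_args : 2; /* at most two */
    layout.on_stack        = int_args - layout.in_registers;
    layout.callee_pops     = layout.on_stack * 4;  /* only stack arguments */
    layout.decorated_bytes = int_args * 4;         /* all arguments count */
    return layout;
}
```

For three arguments this gives two in registers, one on the stack, ret 4 and a suffix of 12, which matches the listing. The name counts all the argument bytes even though the function only removes the ones that were pushed.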
A quick review
Calling convention | Stack cleaning responsibility | Name mangling | Advantages
__stdcall | The called function (i.e., cleanup code only once) | _FunctionName@4*argumentspassed | small size of EXE
__cdecl | The calling function (each time function is called) | _FunctionName | can pass variable number of arguments
__fastcall | The called function (i.e., cleanup code only once) | @FunctionName@4*argumentspassed | fast calling by the use of registers
Note
I would like to thank Mr. Sameer Vasani, my team, and my friend Rahul Bhamre, from whom I have learnt a lot and am still learning new stuff.