Introduction
As you may know, function (method) inlining is an optimization technique performed by almost all modern language compilers. In essence, method inlining means that the compiler replaces a function call with the body of the function. The compiler does this to save the overhead of actually making a call, which would involve pushing each parameter onto the stack, executing the function prologue and epilogue, and so on. .NET is no exception. But bear in mind that it is not the C# or VB.NET compiler but the JIT compiler that decides whether to inline a method. So let's look at what affects its decision.
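To make the idea concrete, here is a minimal sketch of what inlining means conceptually. The names InliningSketch, Square, SumOfSquaresWithCall and SumOfSquaresInlinedByHand are hypothetical and used only for illustration. The hand-inlined version merely mimics the transformation in source form; the real substitution is done by the JIT in machine code, and only if it decides to inline.

```csharp
using System;

public static class InliningSketch
{
    // A small method of the kind that is a typical inlining candidate.
    public static int Square(int x)
    {
        return x * x;
    }

    // What the source code looks like: two ordinary method calls.
    public static int SumOfSquaresWithCall(int a, int b)
    {
        return Square(a) + Square(b);
    }

    // Roughly what the caller behaves like if both calls are inlined:
    // the body of Square is substituted at each call site.
    // (Hand-written illustration, not actual JIT output.)
    public static int SumOfSquaresInlinedByHand(int a, int b)
    {
        return (a * a) + (b * b);
    }

    public static void Main()
    {
        // Both versions compute the same value; only the call overhead differs.
        Console.WriteLine(SumOfSquaresWithCall(3, 4));
        Console.WriteLine(SumOfSquaresInlinedByHand(3, 4));
    }
}
```

Both methods return the same result for the same arguments; the difference is only in whether a call instruction is executed.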
To Inline Or Not To Inline
According to Sasha Goldstein's article http://blogs.microsoft.co.il/sasha/2012/01/20/aggressive-inlining-in-the-clr-45-jit/, the JIT won’t inline:
- Methods marked with MethodImplOptions.NoInlining
- Methods larger than 32 bytes of IL
- Virtual methods
- Methods that take a large value type as a parameter
- Methods on MarshalByRef classes
- Methods with complicated flowgraphs
- Methods that meet other, more exotic criteria
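As a minimal sketch of the first item in the list, the snippet below shows how a method is marked with MethodImplOptions.NoInlining, alongside the MethodImplOptions.AggressiveInlining hint available since .NET 4.5. The class and method names (InliningHints, AddNeverInlined, AddPreferInlined) are hypothetical. Note that AggressiveInlining is only a request: the JIT may still refuse to inline the method.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class InliningHints
{
    // Tells the JIT that this method must not be inlined.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int AddNeverInlined(int a, int b)
    {
        return a + b;
    }

    // Asks the JIT to inline this method if possible (.NET 4.5 and later).
    // This is a hint, not a guarantee.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int AddPreferInlined(int a, int b)
    {
        return a + b;
    }

    public static void Main()
    {
        // The attributes do not change the results, only how the calls may be compiled.
        Console.WriteLine(AddNeverInlined(1, 2));
        Console.WriteLine(AddPreferInlined(1, 2));
    }
}
```

Whether the hinted method is actually inlined can be checked only by inspecting the jitted machine code, as done in the rest of this article.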
But it turns out that the same method can be inlined in one context and not inlined in another. Let's consider the legacy x86 JIT and create a simple method:
public static class Utils
{
    public static bool TryParse(char c, out int val)
    {
        if (c < '0' || c > '9')
        {
            val = 0;
            return false;
        }
        val = (c - '0');
        return true;
    }
}
This method simply checks whether a character is a digit and, if so, returns its numeric value through an output parameter. If we invoke this method:
var key = System.Console.ReadKey();
int val = 0;
if (Utils.TryParse(key.KeyChar, out val))
{
    Console.WriteLine("Parsed");
}
and inspect the JIT-compiled assembly code, we'll see the following fragment (you can use Visual Studio or WinDbg to examine the jitted machine code):
0000001e xor edx,edx
00000020 mov dword ptr [ebp-10h],edx
00000023 movzx ecx,word ptr [ebp-0Ch]
00000027 lea edx,[ebp-10h]
0000002a call dword ptr ds:[00944D4Ch]
00000030 test eax,eax
00000032 je 0000003F
We see a call instruction at address 0000002a. That is, the TryParse method was not inlined. This is probably due to the presence of the output parameter.
For Loops
Now, let's modify the calling code so that the TryParse method is called from inside a for loop.
var key = System.Console.ReadKey();
int val = 0;
for (int i = 0; i < 1; i++)
{
    if (Utils.TryParse(key.KeyChar, out val))
    {
        Console.WriteLine("Parsed");
    }
}
And here is the assembly code generated from the C# snippet above:
00000024 xor esi,esi
00000026 movzx eax,word ptr [ebp-10h]
0000002a cmp eax,30h
0000002d jl 00000034
0000002f cmp eax,39h
00000032 jle 0000003D
00000034 xor edx,edx
00000036 mov dword ptr [ebp-14h],edx
00000039 xor eax,eax
0000003b jmp 00000048
0000003d add eax,0FFFFFFD0h
00000040 mov dword ptr [ebp-14h],eax
00000043 mov eax,1
00000048 test eax,eax
0000004a je 00000057
Clearly, in this case the method was inlined. The comparison and jump instructions (cmp, jl and jle) correspond to the if statement below:
if (c < '0' || c > '9')
The conclusion is that when a method is called inside a for loop, the JIT is more likely to inline it. But what about other kinds of loops?
While Loops
Let's look at a while loop example:
var key = System.Console.ReadKey();
int val = 0;
int i = 0;
while (i < 1)
{
    i++;
    if (Utils.TryParse(key.KeyChar, out val))
    {
        Console.WriteLine("Parsed");
    }
}
You can inspect the jitted assembly code yourself and verify that the while loop behaves the same way as the for loop; that is, the JIT inlines the TryParse method in this case.
Endless Loops
But not all loops behave the same way where inlining is concerned. Let's look at a functionally equivalent endless for loop that breaks after its first iteration:
var key = System.Console.ReadKey();
int val = 0;
for (;;)
{
    if (Utils.TryParse(key.KeyChar, out val))
    {
        Console.WriteLine("Parsed");
    }
    break;
}
Surprisingly, in this case the JIT decides not to inline the TryParse method, even though this loop is logically identical to the previous ones. What about a foreach loop?
Foreach Loops
List<int> l = new List<int> { 0 };
int val = 0;
foreach (var i in l)
{
    var key = System.Console.ReadKey();
    if (Utils.TryParse(key.KeyChar, out val))
    {
        Console.WriteLine("Parsed");
    }
}
The JIT emitted fairly complicated assembly code, as you can see from the listing below:
00000039 lea edi,[ebp-4Ch]
0000003c xorps xmm0,xmm0
0000003f movq mmword ptr [edi],xmm0
00000043 movq mmword ptr [edi+8],xmm0
00000048 mov dword ptr [ebp-4Ch],esi
0000004b mov dword ptr [ebp-48h],edx
0000004e mov eax,dword ptr [esi+10h]
00000051 mov dword ptr [ebp-44h],eax
00000054 mov dword ptr [ebp-40h],edx
00000057 lea edi,[ebp-3Ch]
0000005a lea esi,[ebp-4Ch]
0000005d movs dword ptr es:[edi],dword ptr [esi]
0000005e movs dword ptr es:[edi],dword ptr [esi]
0000005f movs dword ptr es:[edi],dword ptr [esi]
00000060 movs dword ptr es:[edi],dword ptr [esi]
00000061 lea ecx,[ebp-3Ch]
00000064 call 70C2F368
00000069 test eax,eax
0000006b je 000000B4
0000006d lea ecx,[ebp-2Ch]
00000070 xor edx,edx
00000072 call 713461AC
00000077 movzx eax,word ptr [ebp-2Ch]
0000007b cmp eax,30h
0000007e jl 00000085
00000080 cmp eax,39h
00000083 jle 0000008E
But it is easy to find the cmp and jump (jl and jle) instructions that correspond to the already familiar if statement from the TryParse method. That is, a foreach loop also increases the likelihood that the JIT will inline a method called inside it.
ForEach Method
The List<T> class has a ForEach method that is functionally equivalent to a foreach loop. But if we replace foreach with the ForEach method:
int val;
List<int> l = new List<int> { 0 };
l.ForEach((i) =>
{
    var key = System.Console.ReadKey();
    if (Utils.TryParse(key.KeyChar, out val))
    {
        Console.WriteLine("Parsed");
    }
});
no inlining will occur.
Conclusion
Although invoking a method from inside a loop increases the likelihood that the JIT will inline it, not all kinds of loops and loop-like constructs behave the same way. We covered only the 32-bit (x86) JIT, and you may ask whether the 64-bit (x64) JIT behaves differently. Good question; I asked it myself. The answer is that the x64 JIT behaves in exactly the same way. The following table summarizes what we have seen above.
Method invocation | x86 JIT | x64 JIT |
For loop | Inlined | Inlined |
While loop | Inlined | Inlined |
Endless loop | Not inlined | Not inlined |
Foreach loop | Inlined | Inlined |
ForEach method | Not inlined | Not inlined |
The new RyuJIT may behave slightly differently. You can examine it yourself if you are interested and share your findings.
History
- Version 1 - January 2016
- Version 2 - January 2016. x64 JIT version was also covered