Introduction
There are still a few common misconceptions about the .NET garbage collector (GC). The most common appears to be that eligibility for collection is somehow tied to object usage or to variable scope. In fact, the GC couldn't care less about either. All it cares about is whether an object is reachable (live) or unreachable (dead). Runtime effects that appear to prove one or the other are simply due to different MSIL representations of the source code. But let's take it one step at a time.
"Proving" that the GC Does Not Care About Usage
Here is a little program which is a variation of a common example that supposedly proves that the GC analyses variable usage. Or not. The program is simple enough. It starts a timer with a delay and an interval of 1 second, waits for the user to press a key before forcing a collection and then waits for a second key press to exit.
public static void Main(string[] args)
{
    // Timer is System.Threading.Timer: first tick after 1000 ms, then one tick every 1000 ms.
    Timer timer = new Timer(state => Console.WriteLine("Tick..."), null, 1000, 1000);
    Console.WriteLine("Press any key to force a garbage collection.");
    Console.ReadKey(true);
    GC.Collect();
    Console.WriteLine("Press any key to exit.");
    Console.ReadKey(true);
    Console.WriteLine("Good bye!");
}
If the GC considered usage, it would be free to collect the timer after the first line of the method body and would certainly collect it when forcing the collection. For the first experiment, I ran the program using an unoptimized debug build (you'll see later why that matters). And this is the result:
Press any key to force a garbage collection.
Tick...
Tick...
Press any key to exit.
Tick...
Tick...
Good bye!
So we've proven that the GC does not consider usage. Or have we?
"Proving" that the GC Does Care about Usage
Now let's rerun the program, but this time using an optimized release build.
Press any key to force a garbage collection.
Tick...
Tick...
Press any key to exit.
Good bye!
This appears to prove that the GC does consider usage. But both statements cannot be true at the same time. In fact, they are both false. Again, the GC collects unreachable objects and only those. The obvious question then is why the GC collects a seemingly reachable object but only when we compile with code optimization turned on.
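Reachability itself can be observed with a WeakReference, which tracks an object without keeping it alive. The following is a minimal sketch and not part of the timer program; the class ReachabilityDemo, its static field root and the helper methods are hypothetical names made up for illustration. A static field acts as a root, so the object survives a forced collection for as long as the field points to it.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReachabilityDemo
{
    // A static field is a GC root: whatever it references is reachable.
    private static object root;

    // Allocating in a separate, non-inlined method is meant to keep stray
    // copies of the reference out of the caller's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference AllocateRooted()
    {
        root = new object();
        return new WeakReference(root);
    }

    public static bool IsAliveWhileRooted()
    {
        WeakReference weak = AllocateRooted();
        GC.Collect();
        // The static field still references the object, so it is reachable.
        return weak.IsAlive;
    }

    public static bool IsAliveAfterUnrooting()
    {
        WeakReference weak = AllocateRooted();
        root = null; // drop the only strong reference
        GC.Collect();
        return weak.IsAlive;
    }

    public static void Main()
    {
        Console.WriteLine("Rooted:   " + IsAliveWhileRooted());
        Console.WriteLine("Unrooted: " + IsAliveAfterUnrooting());
    }
}
```

The first call returns true because the static field still references the object. The second call usually returns false, but when exactly an unreachable object is reclaimed is up to the runtime, so the sketch does not promise a particular result there.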
Solving the Mystery
First of all, let's remember that variable scope as it is defined in high-level programming languages such as C# doesn't translate directly to the compiled code. Of course, a method is a scope in MSIL and so is a function in the native code, but these scopes don't have to match the scopes in the source. Consider this little nonsensical method:
public static void Scope()
{
    {
        int i = 0;
    }
    {
        int i = 0;
    }
}
In C#, these are two distinct variables, both called i but each in its own scope. The compiler backend though couldn't care less. The first scope requires a local of the type int which, in the C# source, doesn't exist anymore after we exit the first scope. The actual local in MSIL of course still does, which means it is available for reuse for the integer i in the second scope. So the compiler emits only one local:
.method public hidebysig static void Scope() cil managed
{
  .maxstack 1
  .locals init ([0] int32 i)
  IL_0000: nop
  IL_0001: nop
  IL_0002: ldc.i4.0
  IL_0003: stloc.0
  IL_0004: nop
  IL_0005: nop
  IL_0006: ldc.i4.0
  IL_0007: stloc.0
  IL_0008: nop
  IL_0009: ret
}
Scopes in the compiled native code might differ even further due to optimizations such as inlining. More relevant to the issue at hand though is that just because we have a local in the C# code doesn't mean that there necessarily has to be one in the compiled MSIL. The timer program reduced to the first line looks like this when compiled without optimization.
.method public hidebysig static void Main(string[] args) cil managed
{
  .entrypoint
  .maxstack 4
  .locals init ([0] class [mscorlib]System.Threading.Timer timer)
  IL_0000: nop
  IL_0001: ldsfld class [mscorlib]System.Threading.TimerCallback GCTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_0006: brtrue.s IL_001b
  IL_0008: ldnull
  IL_0009: ldftn void GCTest.Program::'<Main>b__0'(object)
  IL_000f: newobj instance void [mscorlib]System.Threading.TimerCallback::.ctor(object,
                                                                                 native int)
  IL_0014: stsfld class [mscorlib]System.Threading.TimerCallback GCTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_0019: br.s IL_001b
  IL_001b: ldsfld class [mscorlib]System.Threading.TimerCallback GCTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_0020: ldnull
  IL_0021: ldc.i4 0x3e8
  IL_0026: ldc.i4 0x3e8
  IL_002b: newobj instance void [mscorlib]System.Threading.Timer::.ctor(class [mscorlib]System.Threading.TimerCallback,
                                                                          object,
                                                                          int32,
                                                                          int32)
  IL_0030: stloc.0
  IL_0031: ret
}
And it makes sense. We declared a local, so there is a local in the MSIL. We create a new instance and store the reference in the local, and so does the MSIL. This changes drastically once we turn optimization on and the compiler outsmarts us. It says: "Well, you wanna write a reference to a local but you're not going to read it back. Ever. Know what, I'm just gonna omit it!"
.method public hidebysig static void Main(string[] args) cil managed
{
  .entrypoint
  .maxstack 8
  IL_0000: ldsfld class [mscorlib]System.Threading.TimerCallback GCTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_0005: brtrue.s IL_0018
  IL_0007: ldnull
  IL_0008: ldftn void GCTest.Program::'<Main>b__0'(object)
  IL_000e: newobj instance void [mscorlib]System.Threading.TimerCallback::.ctor(object,
                                                                                 native int)
  IL_0013: stsfld class [mscorlib]System.Threading.TimerCallback GCTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_0018: ldsfld class [mscorlib]System.Threading.TimerCallback GCTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_001d: ldnull
  IL_001e: ldc.i4 0x3e8
  IL_0023: ldc.i4 0x3e8
  IL_0028: newobj instance void [mscorlib]System.Threading.Timer::.ctor(class [mscorlib]System.Threading.TimerCallback,
                                                                          object,
                                                                          int32,
                                                                          int32)
  IL_002d: pop
  IL_002e: ret
}
That's right: the optimized MSIL doesn't have a local, which is the actual reason why the timer gets collected in release builds but not in debug builds. In release builds, no reference is kept on the stack or anywhere else. Yes, the object is "live" in the sense that it's doing something. But to the GC, it is dead because it is unreachable; no code in the program can get to it anymore. So it's fine to collect it. It's as simple as that.
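If the timer is supposed to keep ticking in a release build too, the reference has to stay reachable for as long as it is needed. One way to sketch this, assuming the same program as above, is a call to GC.KeepAlive at the point up to which the object must live. The using directives and the wrapping class are additions for completeness (the class name Program is taken from the MSIL listing), and the Dispose call is an extra that the original program doesn't have.

```csharp
using System;
using System.Threading;

public static class Program
{
    public static void Main(string[] args)
    {
        Timer timer = new Timer(state => Console.WriteLine("Tick..."), null, 1000, 1000);
        Console.WriteLine("Press any key to force a garbage collection.");
        Console.ReadKey(true);
        GC.Collect();
        Console.WriteLine("Press any key to exit.");
        Console.ReadKey(true);

        // The local is now read after the collection, so the compiler cannot
        // drop it, and the timer stays reachable up to this call.
        GC.KeepAlive(timer);

        // Added for tidiness: stop the timer explicitly once we are done with it.
        timer.Dispose();
        Console.WriteLine("Good bye!");
    }
}
```

GC.KeepAlive does nothing with the object; passing the reference to it is enough to keep the reference in use up to that line. Storing the timer in a static field would be an alternative, at the price of keeping it alive until the field is cleared.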
History
- 2013-08-10: Changed title and added tags
- 2013-08-09: Initial release