Garbage collection is often seen as a kind of black magic in the world of .NET, particularly by junior programmers. There is no real need to understand how it works in order to build most applications, so it remains a mystery until the day you think you need to know how it works, or decide that it might be interesting to learn about. I remember early in my career being part of a team faced with a performance problem caused by a memory leak. We responded by littering the code with GC.Collect calls to see if that helped. Recalling this makes me wince. It is a classic example of “hit and hope”, when really we should have taken more time to understand and diagnose the problem. At the very least, a basic understanding of garbage collection may stop you from trying to solve a performance problem this way, without understanding what that call to GC.Collect is actually doing.
What is Garbage Collection in .NET?
Garbage collection is the automatic process of freeing up (or deallocating) memory which is being taken up by objects your application no longer needs. Most developers know this much at least.
Every time you instantiate an object in .NET, some memory is allocated to store that object. But at some point that object may no longer be needed by your application. Only by deallocating its memory does it become available again for your application to reuse. If memory is not deallocated, then sooner or later your application will run out of memory and stop working.
So Garbage Collection is a Must?
No, garbage collection is just one approach to the problem of how to deallocate memory. In some languages, such as C and C++, it is the responsibility of the programmer to keep track of which objects are no longer needed, and to explicitly deallocate memory as required.
How Does Garbage Collection Work?
Inside the CLR lives a ‘garbage collector’. Unsurprisingly, it is the responsibility of the garbage collector to manage garbage collection. It does this by periodically performing ‘garbage collections’. Every time a garbage collection is performed, the garbage collector basically looks at the memory (i.e. the managed heap for your application) and frees up memory which is no longer needed, that is, memory which is occupied by ‘dead objects’.
How Does It Know When An Object is ‘dead’?
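An object is dead if it is unreachable by your code, that is, if no chain of references leads to it from a ‘root’ such as a static field or a local variable in a method that is still executing. The obvious example is an object referenced only by a local variable inside a method. Once the method has returned, there is no way for your code to reach that object, so it becomes ‘dead’. If the object is also referenced from somewhere else (a field or a collection, say), it stays alive.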
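To make reachability a little more concrete, here is a minimal sketch. The class and method names (ReachabilityDemo, CreateShortLived) are made up for illustration. It uses a WeakReference, which does not keep its target alive, to observe whether an object that was only referenced by a local variable gets reclaimed. The explicit GC.Collect call is there purely so the effect can be observed, not as a recommendation.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReachabilityDemo
{
    // Creates an object referenced only by a local variable and returns a
    // weak reference to it. NoInlining keeps the local in its own stack
    // frame, so nothing in the caller holds a strong reference.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateShortLived()
    {
        var temp = new byte[1024];
        return new WeakReference(temp);
    }

    public static void Main()
    {
        WeakReference weak = CreateShortLived();

        // Demonstration only: force a collection so we can look at the result.
        GC.Collect();

        Console.WriteLine($"Still alive after collection: {weak.IsAlive}");
    }
}
```

On a typical run this should report that the object is no longer alive, because nothing reachable refers to the array once CreateShortLived has returned.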
How Often Does the Garbage Collector Perform a Garbage Collection?
There are three ways a garbage collection can be triggered.
Firstly, if your system has low physical memory, this can trigger a garbage collection.
Secondly, the garbage collector maintains a threshold for the amount of heap memory that allocated objects may acceptably use. If this threshold is surpassed, a garbage collection is triggered.
Finally, a garbage collection can be triggered explicitly by calling the GC.Collect method. This is required only in very rare circumstances.
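The second trigger can be observed without ever calling GC.Collect. The sketch below (AllocationTriggerDemo and AllocateUntilCollected are illustrative names) simply allocates short-lived arrays until GC.CollectionCount(0) reports that the runtime has performed a Generation 0 collection of its own accord. How many allocations that takes depends on the machine and on the runtime configuration, so the cap passed in is an arbitrary safety limit.

```csharp
using System;

public static class AllocationTriggerDemo
{
    // Allocates small short-lived arrays until the runtime performs at least
    // one Generation 0 collection by itself, or until maxAllocations is
    // reached. Returns the number of Generation 0 collections observed.
    public static int AllocateUntilCollected(int maxAllocations)
    {
        int before = GC.CollectionCount(0);

        for (int i = 0; i < maxAllocations; i++)
        {
            // Nothing keeps a reference to this array after the iteration ends.
            var garbage = new byte[1024];
            garbage[0] = 1; // touch the array so the allocation is actually used

            if (GC.CollectionCount(0) > before)
            {
                break;
            }
        }

        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        int observed = AllocateUntilCollected(10_000_000);
        Console.WriteLine($"Generation 0 collections triggered by allocation: {observed}");
    }
}
```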
So ignoring the case of calling GC.Collect, a garbage collection is basically triggered when the garbage collector figures that it might be useful to free up some memory?
Yes.
So whenever a garbage collection is triggered, it frees up all memory on the heap that is occupied by dead objects?
No, because scanning the entire managed heap for dead objects might take a long time, and thus affect performance. During a blocking garbage collection, all managed threads other than the one that triggered it are suspended until it completes. (Background collections of Generation 2 let your threads keep running for much of the work, but they do not eliminate pauses altogether.) So the garbage collector tries to be efficient in the way it looks for and deallocates dead objects. It is selective.
How Is It Selective?
Basically, every object on the heap is categorized into one of three ‘generations’. These are called ‘Generation 0’, ‘Generation 1’ and ‘Generation 2’. The generation of an object indicates how ‘old’ it is, measured not in elapsed time but in the number of garbage collections it has survived. Broadly speaking, Generation 0 is for younger objects, Generation 1 is for middle-aged objects, and Generation 2 is for older objects.
When a garbage collection occurs, it does so on a specific generation, and deallocates any dead objects on that generation and on all younger generations. So a collection on Generation 1 will deallocate dead objects in Generations 0 and 1. The only time that all dead objects on the heap are deallocated is if a collection is performed on Generation 2.
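One way to see that a collection of a given generation also covers the younger ones is to compare the per-generation counters from GC.CollectionCount before and after a forced collection. This is only a sketch, with illustrative names (GenerationScopeDemo, CountDeltasAfterCollect), and it forces a collection purely for demonstration.

```csharp
using System;

public static class GenerationScopeDemo
{
    // Forces a collection of the given generation and returns how much each
    // generation's collection counter increased as a result.
    public static int[] CountDeltasAfterCollect(int generation)
    {
        var before = new int[GC.MaxGeneration + 1];
        for (int g = 0; g <= GC.MaxGeneration; g++)
        {
            before[g] = GC.CollectionCount(g);
        }

        // Demonstration only: collect the requested generation and younger ones.
        GC.Collect(generation);

        var deltas = new int[GC.MaxGeneration + 1];
        for (int g = 0; g <= GC.MaxGeneration; g++)
        {
            deltas[g] = GC.CollectionCount(g) - before[g];
        }

        return deltas;
    }

    public static void Main()
    {
        int[] deltas = CountDeltasAfterCollect(1);
        for (int g = 0; g < deltas.Length; g++)
        {
            Console.WriteLine($"Generation {g} counter increased by {deltas[g]}");
        }
    }
}
```

After a Generation 1 collection, the counters for Generations 0 and 1 should both have gone up, while the Generation 2 counter would normally be unchanged.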
Garbage collection is generally performed on Generation 0, that is, on short-lived objects. This is based on the reasonable assumption that the objects that are most likely to be dead are the ones which have most recently been allocated.
If an object ‘survives’ a garbage collection, it is ‘promoted’ to the next generation. So Generation 0 objects which are still alive at the time of a garbage collection are promoted to Generation 1. The assumption here is that if it is still alive after this collection, there is a good chance that it will still be alive after the next one, so we will move it out of our ‘top priority’ Generation 0 collection.
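Promotion can be watched with GC.GetGeneration, which reports the generation an object currently belongs to. The following sketch (PromotionDemo and TrackGenerations are illustrative names) keeps one object alive across two forced collections and records its generation at each step. Again, the GC.Collect calls are only there to make the behaviour visible.

```csharp
using System;

public static class PromotionDemo
{
    // Records the generation of a single live object when it is first
    // allocated, after one forced collection, and after a second one.
    public static int[] TrackGenerations()
    {
        var survivor = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);

        GC.Collect(); // demonstration only
        seen[1] = GC.GetGeneration(survivor);

        GC.Collect(); // demonstration only
        seen[2] = GC.GetGeneration(survivor);

        return seen;
    }

    public static void Main()
    {
        int[] seen = TrackGenerations();
        Console.WriteLine($"Generation when new: {seen[0]}");
        Console.WriteLine($"After one collection: {seen[1]}");
        Console.WriteLine($"After two collections: {seen[2]}");
    }
}
```

You would usually expect the object to start in Generation 0 and move up one generation per collection it survives, stopping at Generation 2.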
Presumably New Objects Are Allocated into Generation 0 Then?
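When a new object is allocated, it goes into Generation 0, with one exception. Large objects (85,000 bytes or larger) are allocated on a separate ‘large object heap’, which is treated as part of Generation 2 and so is only collected during a Generation 2 collection. This decision is based on an assumption that large objects are more likely to be long-lived, and on the fact that large objects are expensive to move around in memory.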
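The size-based exception is easy to check with GC.GetGeneration. In this sketch (LargeObjectDemo and GenerationOfNewArray are illustrative names) a small array and an array of 85,000 bytes are allocated and their generations are reported immediately afterwards.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given length and returns the generation
    // the runtime reports for it straight after allocation.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        var buffer = new byte[lengthInBytes];
        return GC.GetGeneration(buffer);
    }

    public static void Main()
    {
        Console.WriteLine($"1,000-byte array: generation {GenerationOfNewArray(1_000)}");
        Console.WriteLine($"85,000-byte array: generation {GenerationOfNewArray(85_000)}");
    }
}
```

The small array should be reported as Generation 0 and the large one as Generation 2, even though neither has survived a collection yet.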
…and that’s pretty much it as far as the basics of garbage collection go.
As we can see, the garbage collector makes a few assumptions about your application to help it decide how to behave. Only when these assumptions turn out to be inappropriate for your application do you need to consider the possibility of manipulating it. For example, you can configure the ‘intrusiveness’ of the garbage collector (how often it is triggered), or explicitly trigger a collection on a specific generation.
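As a rough illustration of those two options, the sketch below uses GCSettings.LatencyMode to ask for a less intrusive collector while a piece of work runs, and GC.Collect(0) to trigger a collection on Generation 0 only. The names GcTuningDemo and RunWithLowLatency are made up for this example, and the choice of GCLatencyMode.SustainedLowLatency is an assumption; whether the runtime honours a requested mode depends on how the garbage collector is configured. Treat it as a starting point rather than a recommendation.

```csharp
using System;
using System.Runtime;

public static class GcTuningDemo
{
    // Runs the given action after requesting a less intrusive latency mode,
    // then restores whatever mode was in effect before. Returns the mode the
    // runtime reported while the action's work had just finished.
    public static GCLatencyMode RunWithLowLatency(Action action)
    {
        GCLatencyMode previous = GCSettings.LatencyMode;
        try
        {
            // A request, not a guarantee: the runtime may keep another mode.
            GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
            action();
            return GCSettings.LatencyMode;
        }
        finally
        {
            GCSettings.LatencyMode = previous;
        }
    }

    public static void Main()
    {
        GCLatencyMode during = RunWithLowLatency(() =>
        {
            // Latency-sensitive work would go here.
        });
        Console.WriteLine($"Latency mode during the work: {during}");
        Console.WriteLine($"Latency mode afterwards: {GCSettings.LatencyMode}");

        // Explicitly collect Generation 0 only (rarely needed in practice).
        GC.Collect(0);
    }
}
```

Restoring the previous mode in a finally block matters because the latency mode is a process-wide setting, not something scoped to the calling method.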
The fact that many developers never feel the need to understand how garbage collection works is perhaps an indication that it does its job quite well. The garbage collector does the work so you don’t have to.
The post A Beginner’s Guide to Garbage Collection in .NET appeared first on The Proactive Programmer.