
GC Listeners, Soft References and Truly Weak Collections

19 Apr 2016
Intercepting the Garbage Collector with Listeners that allow us to create C# Soft References and truly Weak Collections

Introduction

It is said that .NET/C# does not provide a way to intercept/hook/mess with the Garbage Collector. And indeed it does not, and for very good reasons. At the same time, C# does not provide what in the Java world is known as "Soft References": weak references that are not dismissed immediately after a GC cycle. These facts make it difficult to create truly weak collections, those whose elements are automatically removed and collected when not in use any longer.

This article first introduces the GCListener class, an abstract base class for objects that need to be informed either when a garbage collection has happened or, after a garbage collection, when a given expiration criterion is met. This class is the parent of SoftReference, a WeakReference replacement that lets you set the expiration criteria and so, effectively, extend its life span programmatically. Finally, building on these soft references, this article also includes a number of SoftXXX collections (such as SoftList, SoftDictionary, and so on) whose elements are kept as soft references, so they are automatically removed and collected when not used for a given time or number of garbage collection cycles.

GCListener

The .NET execution environment triggers a garbage collection cycle whenever it decides one is necessary. There are zillions of articles and tutorials out there about how it works, its generation-based mechanism, what to take into consideration to develop GC-friendly classes, and so on - so I'm not going to repeat all this information, only the specific bits that we'll need for what follows.

One important thing to bear in mind is that those GC cycles are triggered at arbitrary moments, and not only when there is memory pressure. Actually, when you are executing code compiled in "Release" mode, collection can feel very aggressive: generation-0 collections happen very often, and the optimized code lets objects become eligible for collection as soon as they are last used rather than at the end of the enclosing method.

Now, it looks like there is no way of being informed by the execution engine that a garbage collection is about to happen (except for full, generation-2 collections, through GC.RegisterForFullGCNotification) but yes, there is a way of being informed when it has happened - at least for a very concrete scenario. And the good news is that you are most probably using it already on a routine basis: I'm talking about C# finalizers. As you surely know, they are invoked automatically, after a garbage collection, for the unreachable objects that implement them. But what's the point of being informed that we are destroyed... when we are actually destroyed and cannot do anything else?

What GCListener does is use a surrogate object, a temporary one that, when it is finalized, notifies the listener that "owns" it. Once the listener has processed the surrogate's notification, it creates another temporary surrogate, and this cycle repeats until the master object itself is finalized. To let us do something useful with those notifications, any class that inherits from GCListener needs to override the OnGCNotification() method.
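The surrogate cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's actual GCListener: the names ListenerSketch, Surrogate and CountingListener are hypothetical, and the real class also handles locking and expiration criteria.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of the surrogate idea; not the library's GCListener.
public abstract class ListenerSketch
{
    // Throw-away object: nothing references it, so the next collection
    // finds it unreachable and schedules its finalizer.
    private sealed class Surrogate
    {
        private readonly ListenerSketch _owner;
        public Surrogate(ListenerSketch owner) { _owner = owner; }

        ~Surrogate()
        {
            // Runs on the finalizer thread, after a collection has happened.
            if (!Environment.HasShutdownStarted) _owner.SurrogateFinalized();
        }
    }

    private volatile bool _ownerFinalized;

    protected ListenerSketch() { Arm(); }

    // Once the listener itself is finalized, stop creating surrogates.
    ~ListenerSketch() { _ownerFinalized = true; }

    // Derived classes react to the collection here.
    protected abstract void OnGCNotification();

    // Allocates a surrogate and immediately drops it.
    private void Arm() { new Surrogate(this); }

    private void SurrogateFinalized()
    {
        if (_ownerFinalized) return;
        try { OnGCNotification(); }
        catch { /* an exception escaping a finalizer would tear down the process */ }
        Arm(); // wait for the next collection
    }
}

// Example listener that just counts the notifications it receives.
public sealed class CountingListener : ListenerSketch
{
    private int _count;
    public int Count => Volatile.Read(ref _count);
    protected override void OnGCNotification() { Interlocked.Increment(ref _count); }
}
```

After creating a CountingListener and forcing a collection with GC.Collect() followed by GC.WaitForPendingFinalizers(), its Count should normally be greater than zero. Note that the sketch arms the first surrogate from the base constructor, so a derived class must tolerate a notification arriving very early.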

Before continuing with the explanation, let me address some concerns you may have:

  • Yes, this approach comes with a small penalty (that you can measure using the performance tests I have also included in the download). But, at least for the scenarios I'm using it in, this penalty is a small one. Firstly, the temporary objects are extremely light, and they basically just carry a reference back to the owning instance. Secondly, those temporary objects are collected quickly and efficiently - the GC is specifically built for precisely this!
  • The OnGCNotification() method is invoked only when these temporary surrogate objects are finalized, so, as said, after a garbage collection has happened for that instance - but maybe not yet for other listeners - and from the high-priority finalizer thread. Both things have to be taken into account when implementing classes that inherit from GCListener (which I guess you will seldom do if you use the other classes introduced later in this article).
  • Finally, to make our lives slightly easier, the call to the notification method is protected by a light locking model: GCListener tries to lock on its SyncRoot property but, if it is already locked by another thread, the notification is simply discarded (until the next cycle).

This SyncRoot property is virtual, so you can use any object you wish in your derived classes, as long as it is a reference type (a class). For instance, the soft collections explained later in this article override this property to point to their own internal structures.
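The "try to lock, otherwise discard" model mentioned above can be sketched with Monitor.TryEnter. This is only an illustration of the technique; TryLockSketch and TryNotify are hypothetical names, and the library may implement its locking differently.

```csharp
using System;
using System.Threading;

// Hypothetical helper illustrating a non-blocking lock attempt.
public static class TryLockSketch
{
    // Runs 'action' only if 'syncRoot' can be locked without waiting.
    // Returns false (and does nothing) when another thread holds the lock.
    public static bool TryNotify(object syncRoot, Action action)
    {
        if (!Monitor.TryEnter(syncRoot)) return false;
        try
        {
            action();
            return true;
        }
        finally
        {
            Monitor.Exit(syncRoot);
        }
    }
}
```

Never blocking matters here because the caller is the finalizer thread: waiting on a lock held by application code could stall every other finalizer in the process.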

GCListener implements another improvement: the expiration criteria mechanism. The default parameterless constructor just sets the listener to invoke the notification method each time a garbage collection cycle has happened. As this might not be very effective in many circumstances, this class has two properties, GCCycles and GCTicks, that let us specify when we would like to receive those notifications:

  • GCCycles allows us to specify a number of, well, garbage collection cycles after which the notification is fired. Recall that the moment when a garbage collection occurs is under the control of the GC, so the concrete time when this notification happens is completely non-deterministic. If the value of this property is zero, then this criterion is not used.
  • GCTicks allows us to specify the minimum number of time ticks before a notification is fired. Note that this property is only checked when a garbage collection occurs so, again, the concrete moment when this criterion is considered to be met is also unpredictable. If the value of this property is zero, then this criterion is not used.
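One way to picture how the two criteria could be evaluated on each detected collection is the sketch below. The property names GCCycles and GCTicks come from the article; everything else (the ExpirationSketch class, the Pulse method, the injected clock value, and the assumption that both criteria must hold when both are set) is hypothetical and may not match the library.

```csharp
using System;

// Hypothetical sketch of an expiration check driven by GC pulses.
public sealed class ExpirationSketch
{
    private int _pulses;        // collections seen since the last notification
    private long _startTicks;   // time of the last notification (or creation)

    public ExpirationSketch(long nowTicks) { _startTicks = nowTicks; }

    // Zero means "criterion not used", as in the article.
    public int GCCycles { get; set; }
    public long GCTicks { get; set; }

    // Call once per detected collection, passing the current time in ticks.
    // Returns true when a notification should be fired.
    public bool Pulse(long nowTicks)
    {
        _pulses++;

        bool cyclesMet = GCCycles <= 0 || _pulses >= GCCycles;
        bool ticksMet = GCTicks <= 0 || (nowTicks - _startTicks) >= GCTicks;

        // Assumption: when both criteria are set, both must be satisfied.
        if (!(cyclesMet && ticksMet)) return false;

        // Start a fresh expiration cycle.
        _pulses = 0;
        _startTicks = nowTicks;
        return true;
    }
}
```

Because Pulse is only called when a collection is detected, the time criterion is checked lazily: a long quiet period without collections produces no notification, which matches the unpredictability noted above.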

We'll see some examples of how to use these properties below.

Finally, just for completeness, this class provides the GCCurrentPulses property, which tells you how many GC occurrences have happened for this instance since it was instantiated. It is used internally, but I have found it useful to make it public for debugging purposes.

SoftReference

What's wrong with WeakReference? Well, actually nothing, as long as you use it for what it has been designed for. But we have all faced scenarios where we would have liked our weak references to survive beyond the next garbage collection. Java's SoftReferences are not collected immediately, but rather only when there is some kind of memory pressure - but we do not have the same luxury in C#.

So I have borrowed the name but not simply copied Java's functionality: I want our new C# SoftReference to keep a hard reference to the underlying object for, at least, the given time ticks or the given number of garbage collection cycles, or both, and afterwards, if it is not used, let the GC do its magic and collect the object. But if, in the meantime, the object's reference is requested for whatever purpose, then the soft reference grabs the hard reference again for another expiration cycle.

Let's take a look at the following example:

var span = TimeSpan.FromSeconds(5);
var obj = new MyObject(...);
var soft = new SoftReference(obj) { GCTicks = span.Ticks };
obj = null;

Here, we are obtaining a soft reference to an object for which we no longer keep a hard reference. In this case, we want the reference to survive for at least 5 seconds regardless of how many garbage collections happen in that period. We can easily test this in the following way:

var start = DateTime.Now;
do
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
}
while ((DateTime.Now.Ticks - start.Ticks) < span.Ticks);
Assert.IsTrue(soft.IsAlive);

Note that the above code would never have passed the test using WeakReference: after a garbage collection or two, its Target property becomes null and its IsAlive property becomes false. But because we have used SoftReference instead, we have achieved our objective.
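For contrast, the plain WeakReference behavior can be observed with a small sketch like the following. WeakContrast and its methods are hypothetical names; the object is allocated in a separate non-inlined method so that no local variable in the caller keeps it alive.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper showing that a plain WeakReference does not
// protect its target from the next collection.
public static class WeakContrast
{
    // Allocate in a separate, non-inlined method so the caller never
    // holds a hard reference to the new object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference MakeWeak()
    {
        return new WeakReference(new object());
    }

    // Returns whether the target is still alive after a forced collection.
    public static bool SurvivesCollection()
    {
        var weak = MakeWeak();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return weak.IsAlive;
    }
}
```

On a typical runtime SurvivesCollection() returns false, since nothing but the weak reference points at the object once MakeWeak() has returned.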

As soon as the expiration criteria are met, the SoftReference will "forget" the hard reference it was maintaining to the underlying object and, from this moment onwards, rely only on an internal weak reference to it. If another garbage collection happens then, most probably, the object is collected and this weak reference becomes invalid. But if, in the meantime, the Target property is used, then the hard reference is "refreshed" and a complete new cycle begins.

If we want to test this fact for a number of cycles:

var obj = new MyObject();
var soft = new SoftReference(obj) { GCCycles = ... }; // Use your favorite number

var max = ...; // Your test value
for(int i = 0; i < max; i++)
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
    
    Assert.IsTrue(soft.IsAlive);
    var target = soft.Target;
    Assert.IsNotNull(target);
}

All of this logic is handled, obviously, by the SoftReference class itself so, for all practical purposes, you can just use it as a WeakReference replacement but with the luxury of an extended and customizable life span.

If you are curious, or if you need them for some quite convoluted scenarios, this class also provides the following properties: RawTarget, which provides access to the internal hard reference (it may be null if the SoftReference instance is just using the weak reference at this very moment); WeakTarget, which gives you back either the hard reference or the weak one, whichever is not null, but does not refresh the hard reference (as Target does); and WeakReference which, as its name suggests, gives you access to the actual weak reference maintained by this instance.
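The interplay between the hard and weak references, and the difference between Target, WeakTarget and RawTarget, can be sketched as below. The property names follow the article, but SoftRefSketch and its Expire() method are hypothetical: in the real class the hard reference would be dropped by the GC notification once the expiration criteria are met, not by an explicit call.

```csharp
using System;

// Hypothetical sketch of a reference that holds its target strongly
// until told to expire, then falls back to a weak reference.
public sealed class SoftRefSketch
{
    private readonly object _lock = new object();
    private readonly WeakReference _weak;
    private object _hard;

    public SoftRefSketch(object target)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        _hard = target;
        _weak = new WeakReference(target);
    }

    // The hard reference only; null once it has expired.
    public object RawTarget { get { lock (_lock) return _hard; } }

    // Whichever reference is still valid, without refreshing anything.
    public object WeakTarget { get { lock (_lock) return _hard ?? _weak.Target; } }

    // Same as WeakTarget, but re-establishes the hard reference ("refresh").
    public object Target
    {
        get
        {
            lock (_lock)
            {
                _hard = _hard ?? _weak.Target;
                return _hard;
            }
        }
    }

    public bool IsAlive => WeakTarget != null;

    // Stand-in for "the expiration criteria have been met".
    public void Expire() { lock (_lock) _hard = null; }
}
```

After Expire(), the object survives only as long as something else references it or until the next collection; reading Target in that window starts a new cycle, while reading WeakTarget does not.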

Soft Collections

There are probably as many articles out there about weak collections, weak dictionaries, weak caches, and the like, as times Steve Ballmer said "developers" in his famous speech. Or maybe even more. Anyhow, those that tried to implement this feature using WeakReferences are, well, weak (pun intended) and full of nuances, and the few attempts that tried to use the newer ConditionalWeakTable have found that, amazing as it is, it cannot really be used for the purposes we are looking at.

But, do not suffer any longer, download the source code (or grab it from GitHub) and start enjoying the following classes.

SoftList

This class provides an IList<T> implementation whose elements are kept, internally, as SoftReferences, and so they are automatically collected and removed from the list when they are not used any longer, and when the expiration criteria (if any is given) is met.

For simplicity, the expiration criteria of the elements are copied from the ones set on the owning list when these elements are added to the collection:

var list = new SoftList<MyObject>() { GCCycles = 5 };
list.Add(new MyObject());

In this example, after 5 garbage collection pulses, the object added into the collection expires because, as said, it copies its expiration criteria from the owning list. When this happens (and when its weak reference is no longer valid), it is then collected and the list automatically removes it.

Yes, you can use this new list the way you expect it to behave, but with the benefit that you don't have to remove invalid elements from it. For instance:

var obj = list[0];

This example gives you back the element at the given index. Actually it does more: obtaining the value this way refreshes its reference, extending its life span for at least another expiration cycle.

Caveats

Because the exact moment when a given object is collected by the GC is unpredictable, there might be a small delay between when the element is collected and when it is removed - which happens when the list itself processes its own notification method. In these rare circumstances, if you access an element by index, you may retrieve null, which is considered by design an invalid value. Yes, none of the SoftXXX collections accept null as a valid value.
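The window described above, where an entry is dead but not yet purged, can be illustrated with a simplified list of weak references. PurgingListSketch is a hypothetical class built on WeakReference<T>; it is not the library's SoftList, it has no expiration criteria, and its Purge() must be called explicitly rather than from a GC notification.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified list of weakly held elements.
public sealed class PurgingListSketch<T> where T : class
{
    private readonly List<WeakReference<T>> _items = new List<WeakReference<T>>();

    // Null is rejected so that it can unambiguously mean "collected".
    public void Add(T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        _items.Add(new WeakReference<T>(item));
    }

    // Number of slots, including dead ones that have not been purged yet.
    public int RawCount => _items.Count;

    // Returns null when the element was collected but not yet removed.
    public T this[int index] => _items[index].TryGetTarget(out var target) ? target : null;

    // Removes dead entries and returns how many were dropped.
    public int Purge() => _items.RemoveAll(w => !w.TryGetTarget(out _));
}
```

Callers of such a collection should therefore treat a null result from the indexer as "this element is gone" rather than as a stored value.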

SoftDictionary

This class provides an IDictionary<TKey, TValue> implementation whose keys and values are kept internally as SoftReferences. If any member of the pair is collected, either the key or the value, that entry is considered invalid and then is automatically removed from the collection.

As with SoftList, the expiration criteria of both the keys and the values are copied from the ones set on the owning dictionary.

An important consideration is that the internal entries are maintained using the key's hash code, and not the key itself. The SoftDictionary class provides constructors that allow you to specify your favorite custom IEqualityComparer<TKey> comparer whose GetHashCode() method is used to, well, obtain the key's hash code.

var comparer = new MyFavoriteComparer<TKey>();
var dict = new SoftDictionary<TKey, TValue>(comparer);

Interestingly enough, I don't use this constructor often but, when I have needed it, it has been really helpful.
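As an illustration of the kind of comparer that could be supplied, here is a hypothetical case-insensitive comparer for string keys. CaseInsensitiveKeyComparer is not part of the library; it simply implements the standard IEqualityComparer<string> interface, so it is shown here with a regular Dictionary to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: keys that differ only by case are the same key.
public sealed class CaseInsensitiveKeyComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Equals(x, y, StringComparison.OrdinalIgnoreCase);
    }

    // Must agree with Equals: equal keys have to produce equal hash codes.
    public int GetHashCode(string obj)
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj);
    }
}
```

Since the dictionary indexes its entries by the hash code the comparer returns, keeping GetHashCode() consistent with Equals() is what makes "Key" and "KEY" land on the same entry.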

Since this class is an IDictionary implementation, it provides all the methods and properties you would expect. So you can use its Add and Remove methods, its TryGetValue one, and so forth. Let's see, for instance, how it implements ContainsKey:

public bool ContainsKey(TKey key)
{
    if (key == null) throw new ArgumentNullException("key");
    lock (SyncRoot)
    {
        // Entries are indexed by the key's hash code, not by the key itself.
        var hash = _KeysComparer.GetHashCode(key);
        Entry entry = null;
        if (_Dict.TryGetValue(hash, out entry))
        {
            if (!entry.SoftKey.IsAlive) return false;

            // Reading Target refreshes the soft key's hard reference.
            var target = (TKey)entry.SoftKey.Target;
            if (target == null) return false;

            return _KeysComparer.Equals(key, target);
        }
        return false;
    }
}

So we can see that if the key is not found, then nothing is refreshed but, if it is found, then it is refreshed as it has been conceptually "used".

Finally, there are also other constructor overloads that let you specify a comparer for values. In this case, values are compared using the comparer's Equals(TValue a, TValue b) method instead of relying on the default ReferenceEquals.

SoftDictionary variants

  • SoftKeyDictionary<TKey, TValue>, used when only the key shall be considered as a soft reference (or when the value's type is not a class).
  • SoftValueDictionary<TKey, TValue>, used when only the value shall be considered as a soft reference (or when the key's type is not a class). I have used this class when, for instance, the key is an int.

SoftBucketDictionary

And, finally, my personal favorite, the SoftBucketDictionary<TKey, TValue> class that I use as an in-memory cache. It basically permits you to associate to each key a bucket of an arbitrary number of values, where both the keys and the values are kept internally as soft references.

For technical reasons, it does not implement the IDictionary interface but provides similar functionality. Its most important properties and methods are the following:

  • The Keys and Values properties let you enumerate the valid elements in the collection. Actually, the latter enumerates the valid values across all the buckets in it. The KeysCount and ValuesCount properties give you the count of their respective valid elements currently in the collection.
  • The FindList(TKey key) method returns the SoftList<TValue> instance that maintains the values associated with the given key.
  • The Remove(TKey key, TValue value) method removes the given value from the bucket associated with the given key only, not from any other bucket it may also have been added to. If you want to remove all the occurrences of a given value, use the RemoveAll(TValue value) method instead.
  • The Add(TKey key, TValue value) method adds the given value into the bucket associated with the given key, creating it if needed.
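The bucket semantics listed above (one key, many values; Remove acting on a single bucket; RemoveAll acting on every bucket) can be sketched with ordinary strong references. BucketDictionarySketch is a hypothetical class that borrows the member names from the list above; it holds no soft references, and its return types are assumptions (for example, FindList here returns a read-only list rather than a SoftList<TValue>).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical strong-reference sketch of the bucket semantics.
public sealed class BucketDictionarySketch<TKey, TValue>
{
    private readonly Dictionary<TKey, List<TValue>> _buckets =
        new Dictionary<TKey, List<TValue>>();

    public int KeysCount => _buckets.Count;
    public int ValuesCount => _buckets.Values.Sum(bucket => bucket.Count);

    // Adds the value to the key's bucket, creating the bucket if needed.
    public void Add(TKey key, TValue value)
    {
        if (!_buckets.TryGetValue(key, out var bucket))
        {
            bucket = new List<TValue>();
            _buckets[key] = bucket;
        }
        bucket.Add(value);
    }

    // Returns the bucket for the key, or null when there is none.
    public IReadOnlyList<TValue> FindList(TKey key)
    {
        return _buckets.TryGetValue(key, out var bucket) ? bucket : null;
    }

    // Removes the value from the given key's bucket only.
    public bool Remove(TKey key, TValue value)
    {
        return _buckets.TryGetValue(key, out var bucket) && bucket.Remove(value);
    }

    // Removes every occurrence of the value from all buckets.
    public int RemoveAll(TValue value)
    {
        var comparer = EqualityComparer<TValue>.Default;
        int removed = 0;
        foreach (var bucket in _buckets.Values)
            removed += bucket.RemoveAll(v => comparer.Equals(v, value));
        return removed;
    }
}
```

For example, after adding the value 1 under both "a" and "b", Remove("a", 1) leaves the copy under "b" untouched, whereas RemoveAll(1) clears it from every bucket.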

SoftBucketDictionary variants

Yes, as you may have expected, the library also includes the SoftKeyBucketDictionary<TKey, TValue> and the SoftValueBucketDictionary<TKey, TValue> classes, which behave like their SoftDictionary counterparts.

Epilogue and Miscellanea

  • As you may know, WeakReference is, in essence, an object-oriented wrapper around the GCHandle struct. I strongly encourage you to take a look at their source code as it is an invaluable source of information about the internals of the .NET framework.
  • No, at this moment, there is no NuGet package and I have no intention of providing a separate one. The reason is that all the classes introduced in this article are part of the next version of the Kerosene.Tools library for which, yes, there is a NuGet package. If you take a look at the source code, you will notice the "Tools" folder that contains the bare minimum of supporting classes borrowed from that main library.
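To give a feel for the remark about WeakReference and GCHandle, here is a minimal sketch of a weak reference built directly on a weak GCHandle. HandleBackedWeakRef is a hypothetical class and deliberately simpler than the framework's WeakReference (no finalizer, no resurrection tracking), so the handle must be released with Dispose().

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, simplified weak reference built on a weak GCHandle.
public sealed class HandleBackedWeakRef : IDisposable
{
    private GCHandle _handle;

    public HandleBackedWeakRef(object target)
    {
        // A weak handle tracks the object without keeping it alive.
        _handle = GCHandle.Alloc(target, GCHandleType.Weak);
    }

    // Null once the object has been collected or the handle released.
    public object Target => _handle.IsAllocated ? _handle.Target : null;

    public bool IsAlive => Target != null;

    // Handles are a runtime resource and must be freed explicitly.
    public void Dispose()
    {
        if (_handle.IsAllocated) _handle.Free();
    }
}
```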

History

  • Version 1.0.0 - April '16: Initial version

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
