Introduction
It is said that .NET/C# does not provide a way to intercept/hook/mess with the Garbage Collector. And indeed it does not, and for very good reasons. At the same time, C# does not provide what in the Java world is known as "Soft References": weak references that are not dismissed immediately after a GC cycle. These facts make it difficult to create truly weak collections, those whose elements are automatically removed and collected when not in use any longer.
This article first introduces the GCListener class, an abstract base class for objects that need to be informed either when a garbage collection has happened or, after a garbage collection, when a given expiration criterion is met. This class is the parent of SoftReference, a WeakReference replacement that lets you set the expiration criteria and so, effectively, extend its life span programmatically. Finally, using these soft references, this article also includes a number of SoftXXX collections (such as SoftList, SoftDictionary, and so on) whose elements are kept as soft references, so they are automatically removed and collected when not used for a given time or number of garbage collection cycles.
GCListener
The .NET execution environment fires a garbage collection cycle whenever it considers it necessary. There are zillions of articles and tutorials out there about how it works, its generation-based mechanism, what to take into consideration to develop GC-friendly classes, and so on, so I'm not going to repeat all this information, only the specific bits that we'll need for what follows.
One important thing to bear in mind is that those GC cycles are fired at arbitrary moments, and not only when there is memory pressure. Actually, when you are executing code compiled in "Release" mode, the GC can be very aggressive: in an effort to optimize its operation, generation-0 objects are collected very often.
Now, it looks like there is no way of being informed by the execution engine that a garbage collection is about to happen (except for generation-2) but yes, there is a way of being informed when it has happened, at least for a very concrete scenario. And the good news is that you are most probably using it already on a routine basis: I'm talking about C# finalizers. As you surely know, they are automatically invoked after a garbage collection for the unreachable objects that implement them. But what's the point of being informed that we are destroyed... when we are actually destroyed and cannot do anything else?
What GCListener does is use a surrogate object: a temporary one that, when it is finalized, notifies the listener that "owns" it. Once the listener has processed the surrogate's notification, it creates another temporary surrogate, and this cycle repeats until the master object itself is finalized. To allow us to do something useful with those notifications, any class that inherits from GCListener needs to override the OnGCNotification() method.
Before continuing with the explanation, let me address some concerns you may have:
- Yes, this approach comes with a small penalty (that you can measure using the performance tests I have also included in the download). But, at least for the scenarios I'm using it in, this penalty is a small one. Firstly, the temporary objects are extremely light: they basically just carry a reference back to the owning instance. And, secondly, those temporary objects are collected quickly and efficiently, as the GC is specifically built for doing precisely this!
- The OnGCNotification() method is invoked only when these temporary surrogate objects are finalized, so, as said, after the garbage collection has happened for that instance (but maybe not yet for other listeners), and from the high-priority finalizer thread. Both things have to be taken into consideration when implementing classes that inherit from GCListener (which I guess you will seldom do if you use the other classes introduced later in this article).
- Finally, to make our lives slightly easier, the call to the notification method is protected by a light locking model: GCListener tries to lock on the object returned by its SyncRoot property but, if it is already locked by another thread, the notification is merely discarded (until the next cycle).
This SyncRoot property is a virtual one, so you can use any object you wish in your derived classes, as long as it is a reference type (a class). For instance, the soft collections explained later in this article override this property to point to their own internal structures.
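The surrogate cycle and the light locking described above can be sketched roughly as follows. This is a minimal illustration, not the actual GCListener source: the names MiniGCListener, Surrogate, Pulse, Pulses and CountingListener are hypothetical, and details such as the expiration criteria are left out.

```csharp
using System;
using System.Threading;

// Hypothetical, simplified listener: NOT the article's GCListener source.
public abstract class MiniGCListener
{
    // Throwaway object whose only job is to call back when it is finalized.
    private sealed class Surrogate
    {
        private readonly MiniGCListener _owner;
        public Surrogate(MiniGCListener owner) { _owner = owner; }
        ~Surrogate() { _owner.Pulse(); }
    }

    private readonly object _syncRoot = new object();
    private volatile bool _finalized;
    private int _pulses;

    // The surrogate is deliberately left unreferenced so the next GC finalizes it.
    protected MiniGCListener() { new Surrogate(this); }

    // Once the listener itself is finalized, the surrogate chain stops.
    ~MiniGCListener() { _finalized = true; }

    // Any reference-type object can be used as the lock target.
    public virtual object SyncRoot { get { return _syncRoot; } }

    // Number of collections observed so far by this instance.
    public int Pulses { get { return Volatile.Read(ref _pulses); } }

    protected abstract void OnGCNotification();

    // Runs on the finalizer thread.
    private void Pulse()
    {
        if (_finalized || Environment.HasShutdownStarted) return;
        Interlocked.Increment(ref _pulses);

        // Light locking: if another thread holds the lock, skip this cycle.
        if (Monitor.TryEnter(SyncRoot))
        {
            try { OnGCNotification(); }
            catch { /* never let an exception escape the finalizer thread */ }
            finally { Monitor.Exit(SyncRoot); }
        }
        new Surrogate(this); // re-arm for the next collection
    }
}

// Hypothetical derived class that simply counts the notifications it receives.
public sealed class CountingListener : MiniGCListener
{
    public int Notifications;
    protected override void OnGCNotification() { Notifications++; }
}
```

Note that the surrogate holds an ordinary reference to its owner; since nothing references the surrogate itself, this does not keep the owner alive. Creating a CountingListener and forcing a few collections with GC.Collect() followed by GC.WaitForPendingFinalizers() should make its Pulses counter grow.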
GCListener implements another improvement, which is the expiration criteria mechanism. The default parameterless constructor just sets the listener to invoke the notification method each time a garbage collection cycle has happened. As this might not be very effective in many circumstances, this class has two properties, GCCycles and GCTicks, that let us specify when we would like to receive those notifications:
- GCCycles allows us to specify a number of, well, garbage collection cycles after which the notification is fired. You may want to recall that the moment when a garbage collection occurs is under the control of the GC, so the concrete time when this notification happens is completely non-deterministic. If the value of this property is zero, then this criterion is not used.
- GCTicks allows us to specify the minimum number of time ticks before a notification is fired. Note that this property is only checked when a garbage collection occurs so, again, the concrete moment when this criterion is considered to be met is also unpredictable. If the value of this property is zero, then this criterion is not used.
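A check of these two criteria might look like the sketch below. The ExpirationCriteria class and its IsMet method are hypothetical, and so is the rule that, when both properties are set, both must be satisfied; the article does not show how the library combines them.

```csharp
using System;

// Hypothetical helper; the "all enabled criteria must be met" rule is an assumption.
public sealed class ExpirationCriteria
{
    public int GCCycles { get; set; }   // 0 = criterion not used
    public long GCTicks { get; set; }   // 0 = criterion not used

    // cyclesSeen: GC pulses observed since the last reset.
    // ticksElapsed: time ticks elapsed since the last reset.
    public bool IsMet(int cyclesSeen, long ticksElapsed)
    {
        bool cyclesOk = GCCycles <= 0 || cyclesSeen >= GCCycles;
        bool ticksOk = GCTicks <= 0 || ticksElapsed >= GCTicks;
        return cyclesOk && ticksOk;
    }
}
```

With both properties left at zero, IsMet returns true on every call, which matches the default behavior of notifying after each garbage collection. Because the check only runs when a collection happens, meeting the tick threshold does not by itself trigger anything.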
We'll see below some examples of how to use these properties.
Finally, just for completeness, this class provides the GCCurrentPulses property that tells you how many GC occurrences have happened for this instance since it was instantiated. It is used internally, but I have found it useful to make it public for debugging purposes.
SoftReference
What's wrong with WeakReference? Well, actually nothing, as long as you use it for what it was designed for. But we have all faced scenarios where we would have liked our weak references to survive beyond the next garbage collection. Java's SoftReferences are not collected immediately, but rather only when there is some kind of memory pressure, and we do not have the same luxury in C#.
So I have borrowed the name but not simply copied Java's functionality: I want our new C# SoftReference to maintain a hard reference to the underlying object for, at least, the given time ticks or the given garbage collection cycles, or both, and, afterwards, if it is not used, let the GC do its magic to collect the object. But if, in the meantime, the object's reference is requested for whatever purpose, then it grabs the reference again for another expiration cycle.
Let's take a look at the following example:
var span = TimeSpan.FromSeconds(5);
var obj = new MyObject(...);
var soft = new SoftReference(obj) { GCTicks = span.Ticks };
obj = null;
Here, we are obtaining a soft reference to an object for which we are not keeping a hard reference anymore. But, in this case, we want its reference to survive for at least 5 seconds regardless of how many garbage collections may happen in that period. We can easily test this in the following way:
var start = DateTime.Now;
do
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
}
while ((DateTime.Now.Ticks - start.Ticks) < span.Ticks);
Assert.IsTrue(soft.IsAlive);
Note that the above code would never have passed the test using WeakReference, as after a few garbage collections its Target property becomes null and its IsAlive property becomes false. But because we have used SoftReference instead, we have achieved our objective.
As soon as the expiration criteria are met, the SoftReference will "forget" the hard reference it was maintaining to the underlying object and, from this moment onwards, rely only on an internal weak reference to it. If another garbage collection happens then, most probably, the object is collected and this weak reference becomes invalid. But if, in the meantime, the Target property is used, then the hard reference is "refreshed" and a complete new cycle begins.
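The forget-and-refresh behavior just described can be sketched as a small state machine. The MiniSoftReference class below is hypothetical and is not the article's SoftReference: it only supports a cycle count, it does no locking, and the GC pulses are fed in by hand through a NotifyGC() method so that the behavior can be followed step by step.

```csharp
using System;

// Hypothetical sketch: NOT the article's SoftReference.
// GC pulses are simulated by calling NotifyGC() manually.
public sealed class MiniSoftReference
{
    private readonly WeakReference _weak;
    private object _hard;   // dropped once the expiration criterion is met
    private int _pulses;    // pulses seen since the last refresh

    public MiniSoftReference(object target)
    {
        if (target == null) throw new ArgumentNullException("target");
        _hard = target;
        _weak = new WeakReference(target);
    }

    // Number of pulses the hard reference survives.
    public int GCCycles { get; set; }

    // The hard reference, or null if only the weak one is in use right now.
    public object RawTarget { get { return _hard; } }

    public bool IsAlive { get { return _hard != null || _weak.IsAlive; } }

    // Reading Target counts as "using" the object: it refreshes the hard reference.
    public object Target
    {
        get
        {
            var value = _hard ?? _weak.Target;
            if (value != null) { _hard = value; _pulses = 0; }
            return value;
        }
    }

    // Stand-in for the notification received after each garbage collection.
    public void NotifyGC()
    {
        if (_hard == null) return;
        _pulses++;
        if (_pulses >= GCCycles) _hard = null; // from now on, weak reference only
    }
}
```

After GCCycles calls to NotifyGC(), RawTarget becomes null and the object survives only as long as something else references it or no collection has reclaimed it yet. Reading Target before that happens restores the hard reference and restarts the count.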
If we want to test this fact for a number of cycles:
var obj = new MyObject();
var soft = new SoftReference(obj) { GCCycles = ... };
var max = ...;
for (int i = 0; i < max; i++)
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
    Assert.IsTrue(soft.IsAlive);
    var target = soft.Target;
    Assert.IsNotNull(target);
}
All of this logic is handled, obviously, by the SoftReference class itself so, for all practical purposes, you can just use it as a WeakReference replacement, but with the luxury of an extended and customizable life span.
If you are curious, or if you need them for some convoluted scenarios, this class also provides the following properties: RawTarget, which provides access to the internal hard reference (it may be null if the SoftReference instance is just using the weak reference at this very moment); WeakTarget, which gives you back either the hard reference or the weak one, whichever is not null, but does not refresh the hard reference (as the Target property does); and WeakReference which, as its name suggests, gives you access to the actual weak reference maintained by this instance.
Soft Collections
There are probably as many articles out there talking about weak collections, weak dictionaries, weak caches, and the like, as times Steve Ballmer said "developers" in his famous speech. Or maybe even more. Anyhow, those that tried to implement this feature using WeakReferences are, well, weak (pun intended) and full of nuances, and the few attempts that tried to use the new ConditionalWeakTable have found that, amazing as it is, it cannot really be used for the purposes we are looking at.
But, do not suffer any longer, download the source code (or grab it from GitHub) and start enjoying the following classes.
SoftList
This class provides an IList<T> implementation whose elements are kept, internally, as SoftReferences, and so they are automatically collected and removed from the list when they are no longer used and the expiration criteria (if any are given) are met.
For simplicity, the expiration criteria of the elements are copied from those set on the owning list when the elements are added to the collection:
var list = new SoftList<MyObject>() { GCCycles = 5 };
list.Add(new MyObject());
In this example, after 5 garbage collection pulses, the object added into the collection expires because, as said, it copies its expiration criteria from the owning list. When this happens (and when its weak reference is no longer valid), it is then collected and the list automatically removes it.
Yes, you can use this new list the way you expect it to behave, but with the benefit that you don't have to remove invalid elements from it. For instance:
var obj = list[0];
This example gives you back the element at the given index. Actually it does more, because obtaining the value this way refreshes its reference, effectively extending its life span for at least another expiration cycle.
Caveats
Because the exact moment at which a given object is collected by the GC is unpredictable, there might be a small delay between when the element is collected and when it is removed, which happens when the list itself processes its own notification method. In these rare circumstances, if you access an element by index, you may retrieve null, which is considered by design an invalid value. Yes, none of the SoftXXX collections accept null as a valid value.
SoftDictionary
This class provides an IDictionary<TKey, TValue> implementation whose keys and values are kept internally as SoftReferences. If any member of the pair is collected, either the key or the value, that entry is considered invalid and is automatically removed from the collection.
As before, the expiration criteria of both the keys and the values are copied from those set on the owning dictionary.
An important consideration is that the internal entries are maintained using the key's hash code, and not the key itself. The SoftDictionary class provides constructors that allow you to specify your favorite custom IEqualityComparer<TKey> comparer, whose GetHashCode() method is used to, well, obtain the key's hash code.
var comparer = new MyFavoriteComparer<TKey>();
var dict = new SoftDictionary<TKey, TValue>(comparer);
Interestingly enough, I don't use this constructor often but, when I have needed it, it has been really helpful.
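To see why the comparer matters when entries are indexed by hash code, consider the following sketch. HashIndexedMap is a hypothetical class that uses ordinary strong references, so it illustrates only the indexing idea and none of the soft-reference handling. How the library treats two different keys that share a hash code is not covered in this article; the sketch simply lets the later entry replace the earlier one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch with strong references only; it illustrates hash-code
// indexing, not the soft-reference handling of the article's SoftDictionary.
public sealed class HashIndexedMap<TKey, TValue>
{
    private readonly IEqualityComparer<TKey> _comparer;
    private readonly Dictionary<int, KeyValuePair<TKey, TValue>> _entries =
        new Dictionary<int, KeyValuePair<TKey, TValue>>();

    public HashIndexedMap(IEqualityComparer<TKey> comparer = null)
    {
        _comparer = comparer ?? EqualityComparer<TKey>.Default;
    }

    public void Set(TKey key, TValue value)
    {
        // Assumption: a later key with the same hash code replaces the earlier entry.
        _entries[_comparer.GetHashCode(key)] = new KeyValuePair<TKey, TValue>(key, value);
    }

    public bool TryGet(TKey key, out TValue value)
    {
        KeyValuePair<TKey, TValue> entry;
        // The hash code locates the entry; Equals confirms it is really the same key.
        if (_entries.TryGetValue(_comparer.GetHashCode(key), out entry) &&
            _comparer.Equals(key, entry.Key))
        {
            value = entry.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }
}
```

With StringComparer.OrdinalIgnoreCase, for example, an entry stored under "Alpha" is also found under "ALPHA", because both the hash code and the equality check come from the comparer. With the default comparer the second lookup fails.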
Being an IDictionary implementation, this class provides all the methods and properties you expect. So you can use its Add and Remove methods, its TryGetValue method, and so forth. Let's see, for instance, how it implements ContainsKey:
public bool ContainsKey(TKey key)
{
    if (key == null) throw new ArgumentNullException("key");
    lock (SyncRoot)
    {
        // Entries are indexed by the key's hash code, not by the key itself.
        var hash = _KeysComparer.GetHashCode(key);
        Entry entry = null;
        if (_Dict.TryGetValue(hash, out entry))
        {
            if (!entry.SoftKey.IsAlive) return false;

            // Reading Target refreshes the soft reference to the key.
            var target = (TKey)entry.SoftKey.Target;
            if (target == null) return false;

            // The same hash code is not enough: the keys must also compare equal.
            return _KeysComparer.Equals(key, target);
        }
        return false;
    }
}
So we can see that if the key is not found, then nothing is refreshed but, if it is found, then it is refreshed as it has been conceptually "used".
Finally, there are also other constructor overloads that let you specify a comparer for values. In this case, values are compared using the comparer's Equals(TValue a, TValue b) method instead of relying on object's default ReferenceEquals method.
SoftDictionary variants
- SoftKeyDictionary<TKey, TValue>, used when only the key shall be kept as a soft reference (or when the value's type is not a class).
- SoftValueDictionary<TKey, TValue>, used when only the value shall be kept as a soft reference (or when the key's type is not a class). I have used this class when, for instance, the key is an int.
SoftBucketDictionary
And, finally, my personal favorite, the SoftBucketDictionary<TKey, TValue> class that I use as an in-memory cache. It basically lets you associate with each key a bucket of an arbitrary number of values, where both the keys and the values are kept internally as soft references.
For technical reasons, it does not implement the IDictionary interface but provides similar functionality. Its most important properties and methods are the following:
- The Keys and Values properties let you enumerate the valid elements in the collection. Actually, the latter enumerates the valid values across all the buckets in it. The KeysCount and ValuesCount properties give you the count of their respective valid elements currently in the collection.
- The FindList(TKey key) method returns the SoftList<TValue> instance that maintains the values associated with the given key.
- The Remove(TKey key, TValue value) method removes the given value from the bucket associated with the given key, but not from any other bucket it may have been added to. If you want to remove all the occurrences of a given value, use the RemoveAll(TValue value) method instead.
- The Add(TKey key, TValue value) method adds the given value into the bucket associated with the given key, creating the bucket if needed.
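The bucket semantics listed above can be illustrated with a small sketch. MiniBucketDictionary is a hypothetical class built on ordinary strong references, so nothing expires in it. The return types of Remove and RemoveAll, and FindList returning a plain List<TValue> (or null when the key is unknown), are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch using ordinary (strong) references; the article's class keeps
// keys and values as soft references and returns a SoftList<TValue> from FindList.
public sealed class MiniBucketDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, List<TValue>> _buckets =
        new Dictionary<TKey, List<TValue>>();

    public int KeysCount { get { return _buckets.Count; } }

    // Counts the values across all buckets.
    public int ValuesCount
    {
        get { int n = 0; foreach (var b in _buckets.Values) n += b.Count; return n; }
    }

    // Adds the value to the key's bucket, creating the bucket if needed.
    public void Add(TKey key, TValue value)
    {
        List<TValue> bucket;
        if (!_buckets.TryGetValue(key, out bucket))
            _buckets[key] = bucket = new List<TValue>();
        bucket.Add(value);
    }

    // Returns the bucket for the key, or null if there is none (assumption).
    public List<TValue> FindList(TKey key)
    {
        List<TValue> bucket;
        return _buckets.TryGetValue(key, out bucket) ? bucket : null;
    }

    // Removes the value from this key's bucket only.
    public bool Remove(TKey key, TValue value)
    {
        var bucket = FindList(key);
        return bucket != null && bucket.Remove(value);
    }

    // Removes every occurrence of the value, whatever bucket it is in.
    public int RemoveAll(TValue value)
    {
        int removed = 0;
        foreach (var bucket in _buckets.Values)
            removed += bucket.RemoveAll(v => EqualityComparer<TValue>.Default.Equals(v, value));
        return removed;
    }
}
```

If the same value is added under keys "a" and "b", calling Remove("a", value) leaves the copy in bucket "b" untouched, while RemoveAll(value) clears it from every bucket.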
SoftBucketDictionary variants
Yes, as you may have expected, the library also includes the SoftKeyBucketDictionary<TKey, TValue> and SoftValueBucketDictionary<TKey, TValue> classes, which behave like their SoftDictionary counterparts.
Epilogue and Miscellanea
- As you may know, WeakReference is, in essence, an object-oriented wrapper around the GCHandle struct. I strongly encourage you to take a look at their source code as it is an invaluable source of information about the internals of the .NET framework.
- No, at this moment, there is no NuGet package and I have no intention of providing a separate one. The reason is that all the classes introduced in this article are part of the next version of the Kerosene.Tools library for which, yes, there is a NuGet package. If you take a look at the source code, you will notice the "Tools" folder that contains the bare minimum supporting classes borrowed from that main library.
History
- Version 1.0.0 - April '16: Initial version