Synchronizing ConcurrentDictionary

15 Feb 2013
ConcurrentDictionary's methods can call your value factory more than once. When is this a problem? How can it be overcome?

Introduction 

ConcurrentDictionary solves many problems when it comes to high-performance synchronization, but it has one catch. It uses optimistic concurrency, so a value factory passed to methods such as GetOrAdd may be executed more than once for the same key. Let's explore the options for resolving this...

Typical Usage  

C#
var dictionary = new ConcurrentDictionary<string, MyClass>();
Parallel.ForEach(source, dataitem =>{
  var key = dataitem.Category;
  var myObject = dictionary.GetOrAdd(key, k=>new MyClass());
  /* Do some stuff with myObject */
});

The sample above is one possible use of a ConcurrentDictionary. You may need to populate a dictionary on the fly with objects that need work done on them. This is a bit oversimplified, but I just wanted to outline the pattern first.

The ConcurrentDictionary will generate a MyClass object if the key doesn't exist in the collection, and the Parallel.ForEach may request the same key from several threads at once. If the entry does not exist and two threads ask for the same key, both threads can run the value factory (and therefore the constructor) at the same time, but only one of the results is registered. In the typical case, where the constructor is lightweight and takes almost no time, this is a very efficient, non-blocking means of synchronization. Constructing more than one object doesn't matter here because every calling thread still gets back the same registered instance. This works great until you are generating an object that is IDisposable...
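To make the race concrete, here is a minimal sketch that forces two threads into the value factory at the same time. The names FactoryRaceDemo and Run, and the empty MyClass, are placeholders invented for this illustration. The Barrier is only there to make the overlap reproducible; real code would not contain it. Under these assumptions the factory should be counted twice while both threads receive the same instance.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class FactoryRaceDemo
{
    // Hypothetical stand-in for the article's MyClass.
    public sealed class MyClass { }

    // Returns how often the factory ran and whether both callers got the same object.
    public static (int FactoryCalls, bool SameInstance) Run()
    {
        var dictionary = new ConcurrentDictionary<string, MyClass>();
        int factoryCalls = 0;
        MyClass first = null, second = null;

        // The barrier holds each factory call until both threads are inside it.
        // The timeout only guards against hanging if the assumption is wrong.
        using (var bothInFactory = new Barrier(2))
        {
            Func<string, MyClass> factory = k =>
            {
                Interlocked.Increment(ref factoryCalls);
                bothInFactory.SignalAndWait(TimeSpan.FromSeconds(5));
                return new MyClass();
            };

            var t1 = new Thread(() => first = dictionary.GetOrAdd("key", factory));
            var t2 = new Thread(() => second = dictionary.GetOrAdd("key", factory));
            t1.Start();
            t2.Start();
            t1.Join();
            t2.Join();
        }

        return (factoryCalls, ReferenceEquals(first, second));
    }
}
```

Calling FactoryRaceDemo.Run() and inspecting the returned pair shows the behaviour described above: the extra MyClass is simply discarded, which is harmless for a cheap object and a leak for a disposable one.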

The Non-Typical... 

So what if your value-factory actually does some important work?

C#
ConcurrentDictionary<object, object> c = /* Some static registry */;

object writeLock = null;
object lockEntry = null;

lockEntry = c.GetOrAdd(key, k =>
{
    writeLock = new Object();
    // First to get the lock owns it...
    Monitor.Enter(writeLock);

    return writeLock;
});

In this sample, the goal is for the first request for a key to acquire a lock while any subsequent requests wait in line. The intent is that a non-null writeLock means you hold the lock and a null one means you don't. The problem is that concurrent callers can each run the factory, so more than one writeLock may be created and entered, yet only one of them is registered in the dictionary. A thread whose object was discarded still sees a non-null writeLock and wrongly concludes that it owns the lock.

The Quick Fix

The way to compensate for creating more than one object is to compare the two afterwards. If writeLock is not null but doesn't match lockEntry, then the object this thread created is not the true lock object.

if (writeLock != null && lockEntry != writeLock)
{
    // The value factory ran on more than one thread (concurrently) and ours was not registered.
    // writeLock is then a local object that will not be accessed from other threads.
    // Therefore we do not own the real lock.
    Monitor.Exit(writeLock);
    // Is this object IDisposable?  Call .Dispose() here.
    writeLock = null;
}

If the above code is appended to the previous sample, the problem is resolved. Even though the non-blocking GetOrAdd may have created two lock objects, only the thread whose object was registered keeps a non-null writeLock.

This pattern also allows for calling Dispose() on IDisposable objects that end up being created but not used.
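For the IDisposable case, the same comparison can be wrapped in a small helper. The sketch below is one possible shape, not a built-in API: GetOrAddDisposingLoser and TrackedResource are hypothetical names made up for this example. It assumes the factory runs at most once per GetOrAdd call on a given thread, so the captured local holds the only object this caller created.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical disposable type used only to observe the helper's behaviour.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() { IsDisposed = true; }
}

public static class ConcurrentDictionaryDisposal
{
    // Hypothetical helper, not part of the .NET API.
    public static TValue GetOrAddDisposingLoser<TKey, TValue>(
        this ConcurrentDictionary<TKey, TValue> dictionary,
        TKey key,
        Func<TKey, TValue> valueFactory)
        where TValue : class, IDisposable
    {
        // Remember what this caller created, if the factory ran at all.
        TValue created = null;
        TValue registered = dictionary.GetOrAdd(key, k => created = valueFactory(k));

        // We created an object but another one was registered: ours is unused.
        if (created != null && !ReferenceEquals(created, registered))
        {
            created.Dispose();
        }

        return registered;
    }
}
```

A call then looks like dictionary.GetOrAddDisposingLoser(key, k => new TrackedResource()). The caller always receives the registered instance, and any instance that lost the race is disposed before the method returns.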

What if I can't afford for it to happen more than once?

What about the case where your value factory is long running and you need to be certain it runs only once? The answer is ConcurrentDictionary<TKey,Lazy<TValue>>.

C#
var dictionary = new ConcurrentDictionary<string, Lazy<MyClass>>();
Parallel.ForEach(source, dataitem =>{
  var key = dataitem.Category;
  var myObject = dictionary.GetOrAdd(key, k=>new Lazy<MyClass>(()=>{
    /* Some potentially long running code */
    return new MyClass();
  },LazyThreadSafetyMode.ExecutionAndPublication)).Value;
  /* Do some stuff with myObject */
});

The result is that your value factory executes only once. For this specific usage, it performs very well. Only one Lazy<MyClass> object is returned even though more than one may be generated, and the Value property is internally synchronized within the Lazy<MyClass>. The trouble you might have with this method is if you need to remove an entry by its value. You would have to iterate through the collection one by one to find the key for that object and then remove it by key.
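Removal by value could look something like the sketch below. TryRemoveByValue is a hypothetical extension method written for this illustration. It scans the entries, skips any Lazy whose factory has not run yet so the search never triggers the long-running work, and then removes the matching key/value pair through the dictionary's ICollection<KeyValuePair<...>> implementation, which only removes the entry if the key still maps to that same Lazy instance.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class LazyDictionaryRemoval
{
    // Hypothetical helper, not part of the .NET API.
    public static bool TryRemoveByValue<TKey, TValue>(
        this ConcurrentDictionary<TKey, Lazy<TValue>> dictionary,
        TValue value)
        where TValue : class
    {
        // Enumerating a ConcurrentDictionary is safe while other threads modify it.
        foreach (KeyValuePair<TKey, Lazy<TValue>> pair in dictionary)
        {
            // IsValueCreated is checked first so that Value never runs a pending factory.
            if (pair.Value.IsValueCreated && ReferenceEquals(pair.Value.Value, value))
            {
                // Removes the entry only if this key still holds this exact Lazy.
                var entries = (ICollection<KeyValuePair<TKey, Lazy<TValue>>>)dictionary;
                return entries.Remove(pair);
            }
        }

        return false;
    }
}
```

This is a linear scan, so it is only reasonable when removals by value are rare or the dictionary is small. If they are frequent, keeping the key on the object itself avoids the search.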

Conclusion   

Personally, I love ConcurrentDictionary and use it quite often. But once in a while, I have to deal with this type of synchronization issue. At the moment, there isn't a built-in SynchronizedDictionary class in the .NET Framework that would ensure only one execution. But these two methods build on the existing non-blocking performance of ConcurrentDictionary and resolve this problem.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)