In relation to my post about the MultiKeyDictionary
object that I created (you can read about it on CodeProject here, or on my blog here), I created some performance tests to illustrate the difference between my method of creating this class, vs. some alternatives.
For instance, you could use Tuple<int, string, string>
, or even Dictionary<int, KeyValuePair<string, string>>
as the underlying dictionary
, iterating over the collections using LINQ. The functionality would be there, no doubt about that.
Unfortunately, when you use LINQ for much of anything, the performance immediately goes downhill. But just to make sure that I wasn't crazy, before I started writing this blog entry, I put together a performance comparison using the aforementioned objects for storage - of course, benchmarking them against my MultiKeyDictionary
.
The results are pretty cut & dry - the MultiKeyGenericDictionary
outperforms the other two methods by a factor of about 1000x for the Dictionary<int, KeyValuePair<string, string>
, and is about 400x faster than the Tuple<int, string, string>
. Also, once you factor in synchronizing massively multi-threaded access to the underlying objects, you quickly find out that the comparison between the methods are not even close.
Take a look at these code snippets on the access to the underlying objects - keep in mind that LINQ is incredibly expensive, and doesn't parallelize well when you have multiple threads running multiple LINQ queries on the same object, all trying to parallelize their queries. In fact, you're better off not using the AsParallel()
extension.
This is the equivalent of the MultiKeyDictionary Associate
method:
genericDictionaryLock.EnterUpgradeableReadLock();
try
{
if (genericDictionary.ContainsKey(i))
{
try
{
genericDictionaryLock.EnterWriteLock();
genericDictionary[i] = new KeyValuePair<string,
string>(string.Format("Sub Key {0}", i), genericDictionary[i].Value);
}
finally
{
genericDictionaryLock.ExitWriteLock();
}
}
else
{
throw new Exception();
}
}
finally
{
genericDictionaryLock.ExitUpgradeableReadLock();
}
And here is a lookup by sub-key:
genericDictionaryLock.EnterReadLock();
try
{
var val = (from d in genericDictionary.Values
where d.Key == string.Format("Sub Key {0}", i) select d.Value).Single();
Debug.Assert(val == string.Format("Value {0}", i));
}
finally
{
genericDictionaryLock.ExitReadLock();
}
What you should notice is that the lookups and assignments are very expensive, creating many more objects that you really need.
The Tuple
method is similar to the above snippets, and while it is faster than the Dictionary<int, KeyValuePair<string, string>>
, it is still nowhere near as fast as indexing directly into the collection to the desired value based on a hashed key, as is done in the MultiKeyDictionary
class.
As easy as it might be to do, the poor performance of the other options precludes actually using them in any application that requires speedy access to the underlying data. The following image is from the most average 5 runs out of about 20, but you can always run the code for yourself (see the bottom of this post).
Also, keep in mind that these tests were done with data sets of 100 records. When you start going to larger data sets, say around 1000000, the differences in timing appear to grow exponentially. For instance, using a data set of 1000 records, it takes the generic dictionary 140 seconds to complete, it takes the tuple 45 seconds, and the MultiKeyDictionary .12 seconds.
Times below are in ticks. (Click to enlarge)
The full code for my performance test is below... If you want to use it, you'll need to have my MultiKeyDictionary
class.
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using System.Linq;
using Aron.Weiler;
using System.Collections.Generic;
namespace MultiKeyGenericDictionarySpike
{
class Program
{
static MultiKeyDictionary<int, string, string> mkDictionary =
new MultiKeyDictionary<int, string, string>();
static List<Tuple<string, string, string>> tuples =
new List<Tuple<string, string, string>>();
static ReaderWriterLockSlim tupleLock = new ReaderWriterLockSlim();
static Dictionary<int, KeyValuePair<string, string>> genericDictionary =
new Dictionary<int, KeyValuePair<string, string>>();
static ReaderWriterLockSlim genericDictionaryLock = new ReaderWriterLockSlim();
static CountdownEvent countDown;
static void Main(string[] args)
{
perfTest:
Console.WriteLine("------------------");
Console.WriteLine("");
Console.WriteLine("Performance Test Running...");
Console.WriteLine("");
RunTuplePerfTest();
RunGenericDictionaryPerfTest();
RunMultiKeyDictionaryPerfTest();
Console.WriteLine("Press any key to quit, or P to run perf test again...");
if (Console.ReadKey().Key == ConsoleKey.P)
goto perfTest;
}
private static void RunTuplePerfTest()
{
Task[] tasks = SetUpPerfTask(TuplePerfTest);
RunTasks(tasks, "Tuple");
}
private static void RunMultiKeyDictionaryPerfTest()
{
Task[] tasks = SetUpPerfTask(MultiKeyDictionaryPerfTest);
RunTasks(tasks, "MK Dictionary");
}
private static void RunGenericDictionaryPerfTest()
{
Task[] tasks = SetUpPerfTask(GenericDictionaryPerfTest);
RunTasks(tasks, "Generic Dictionary");
}
private static void RunTasks(Task[] tasks, string name)
{
Stopwatch sw = ResetStopWatchAndCountdown();
sw.Start();
foreach (Task task in tasks)
{
task.Start();
}
countDown.Wait();
sw.Stop();
Console.WriteLine("Done in {0} ticks ({1} seconds) for {2}",
sw.ElapsedTicks, sw.Elapsed.TotalSeconds, name);
}
private static Task[] SetUpPerfTask(Action<int, int> perfTest)
{
int seed = 100;
int count = 99;
Task[] tasks = new Task[10];
for (int i = 0; i < 10; i++)
{
int temp = i;
tasks[i] = new Task(() => perfTest(seed * temp, count));
}
return tasks;
}
private static Stopwatch ResetStopWatchAndCountdown()
{
Stopwatch sw = new Stopwatch();
countDown = new CountdownEvent(10);
return sw;
}
private static void MultiKeyDictionaryPerfTest(int seed, int count)
{
Parallel.For(seed, seed + count, (i) =>
{
mkDictionary.Add(i, string.Format("Sub Key {0}", i),
string.Format("Value {0}", i));
});
Parallel.For(seed, seed + count, (i) =>
{
Debug.Assert(mkDictionary[i] == (string.Format("Value {0}", i)));
});
Parallel.For(seed, seed + count, (i) =>
{
mkDictionary.Associate(string.Format("Sub Key {0}", i), i);
});
Parallel.For(seed, seed + count, (i) =>
{
Debug.Assert(mkDictionary[string.Format("Sub Key {0}", i)] ==
(string.Format("Value {0}", i)));
});
Parallel.For(seed, seed + count, (i) =>
{
mkDictionary.Remove(i);
});
countDown.Signal();
}
private static void GenericDictionaryPerfTest(int seed, int count)
{
Parallel.For(seed, seed + count, (i) =>
{
genericDictionaryLock.EnterWriteLock();
try
{
genericDictionary.Add(i, new KeyValuePair<string, string>
(string.Format("Sub Key {0}", i), string.Format("Value {0}", i)));
}
finally
{
genericDictionaryLock.ExitWriteLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
KeyValuePair<string, string> kvp = new KeyValuePair<string, string>
(string.Format("Sub Key {0}", i), string.Format("Value {0}", i));
genericDictionaryLock.EnterReadLock();
try
{
Debug.Assert(genericDictionary[i].Equals(kvp));
}
finally
{
genericDictionaryLock.ExitReadLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
genericDictionaryLock.EnterUpgradeableReadLock();
try
{
if (genericDictionary.ContainsKey(i))
{
try
{
genericDictionaryLock.EnterWriteLock();
genericDictionary[i] = new KeyValuePair<string, string>
(string.Format("Sub Key {0}", i), genericDictionary[i].Value);
}
finally
{
genericDictionaryLock.ExitWriteLock();
}
}
else
{
throw new Exception();
}
}
finally
{
genericDictionaryLock.ExitUpgradeableReadLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
genericDictionaryLock.EnterReadLock();
try
{
var val = (from d in genericDictionary.Values where d.Key ==
string.Format("Sub Key {0}", i) select d.Value).Single();
Debug.Assert(val == string.Format("Value {0}", i));
}
finally
{
genericDictionaryLock.ExitReadLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
genericDictionaryLock.EnterUpgradeableReadLock();
try
{
if (genericDictionary.ContainsKey(i))
{
try
{
genericDictionaryLock.EnterWriteLock();
genericDictionary.Remove(i);
}
finally
{
genericDictionaryLock.ExitWriteLock();
}
}
else
{
throw new Exception();
}
}
finally
{
genericDictionaryLock.ExitUpgradeableReadLock();
}
});
countDown.Signal();
}
private static void TuplePerfTest(int seed, int count)
{
Parallel.For(seed, seed + count, (i) =>
{
tupleLock.EnterWriteLock();
try
{
tuples.Add(new Tuple<string, string, string>(i.ToString(),
string.Format("Sub Key {0}", i), string.Format("Value {0}", i)));
}
finally
{
tupleLock.ExitWriteLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
tupleLock.EnterReadLock();
try
{
Debug.Assert((from t in tuples where t.Item1 ==
string.Format("{0}", i) select t).Count() > 0);
}
finally
{
tupleLock.ExitReadLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
tupleLock.EnterUpgradeableReadLock();
try
{
Tuple<string, string, string> item = (from t in tuples
where t.Item1 == string.Format("{0}", i) select t).Single();
item = new Tuple<string, string, string>(item.Item1,
string.Format("Sub Key {0}", i), item.Item3);
try
{
tupleLock.EnterWriteLock();
tuples.Remove(item);
tuples.Add(item);
}
finally
{
tupleLock.ExitWriteLock();
}
}
finally
{
tupleLock.ExitUpgradeableReadLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
tupleLock.EnterReadLock();
try
{
Debug.Assert((from t in tuples where t.Item2 ==
string.Format("Sub Key {0}", i) select t).Count() > 0);
}
finally
{
tupleLock.ExitReadLock();
}
});
Parallel.For(seed, seed + count, (i) =>
{
tupleLock.EnterUpgradeableReadLock();
try
{
Tuple<string, string, string> item =
(from t in tuples where t.Item1 == string.Format("{0}", i)
select t).Single();
try
{
tupleLock.EnterWriteLock();
tuples.Remove(item);
}
finally
{
tupleLock.ExitWriteLock();
}
}
finally
{
tupleLock.ExitUpgradeableReadLock();
}
});
countDown.Signal();
}
}
}