Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / DevOps / testing

Performance Tests Between Multiple multi-key Dictionary Implementations

5.00/5 (1 vote)
24 May 2011CPOL2 min read 23.2K  
Performance tests between multiple multi-key Dictionary implementations

In relation to my post about the MultiKeyDictionary object that I created (you can read about it on CodeProject here, or on my blog here), I created some performance tests to illustrate the difference between my method of creating this class, vs. some alternatives.

For instance, you could use Tuple<int, string, string>, or even Dictionary<int, KeyValuePair<string, string>> as the underlying dictionary, iterating over the collections using LINQ. The functionality would be there, no doubt about that.

Unfortunately, when you use LINQ for much of anything, the performance immediately goes downhill. But just to make sure that I wasn't crazy, before I started writing this blog entry, I put together a performance comparison using the aforementioned objects for storage - of course, benchmarking them against my MultiKeyDictionary.

The results are pretty cut & dry - the MultiKeyGenericDictionary outperforms the other two methods by a factor of about 1000x for the Dictionary<int, KeyValuePair<string, string>, and is about 400x faster than the Tuple<int, string, string>. Also, once you factor in synchronizing massively multi-threaded access to the underlying objects, you quickly find out that the comparison between the methods are not even close.

Take a look at these code snippets on the access to the underlying objects - keep in mind that LINQ is incredibly expensive, and doesn't parallelize well when you have multiple threads running multiple LINQ queries on the same object, all trying to parallelize their queries. In fact, you're better off not using the AsParallel() extension.

This is the equivalent of the MultiKeyDictionary Associate method:

C#
genericDictionaryLock.EnterUpgradeableReadLock();

try
{
    if (genericDictionary.ContainsKey(i))
    {
        try
        {
            genericDictionaryLock.EnterWriteLock();

            // Notice that you have to create a new KeyValuePair 
            // in order to associate the sub-key, 
            // since KVPs can only have their values set in the constructor.
            genericDictionary[i] = new KeyValuePair<string, 
		string>(string.Format("Sub Key {0}", i), genericDictionary[i].Value);
        }
        finally
        {
            genericDictionaryLock.ExitWriteLock();
        }
    }
    else
    {
        // Realistic error checking/handling would usually go here
        throw new Exception(); 
    }
}
finally
{
    genericDictionaryLock.ExitUpgradeableReadLock();
}

And here is a lookup by sub-key:

C#
genericDictionaryLock.EnterReadLock();

try
{
   var val = (from d in genericDictionary.Values 
	where d.Key == string.Format("Sub Key {0}", i) select d.Value).Single();

   Debug.Assert(val == string.Format("Value {0}", i));
}
finally
{
   genericDictionaryLock.ExitReadLock();
}

What you should notice is that the lookups and assignments are very expensive, creating many more objects that you really need.

The Tuple method is similar to the above snippets, and while it is faster than the Dictionary<int, KeyValuePair<string, string>>, it is still nowhere near as fast as indexing directly into the collection to the desired value based on a hashed key, as is done in the MultiKeyDictionary class.

As easy as it might be to do, the poor performance of the other options precludes actually using them in any application that requires speedy access to the underlying data. The following image is from the most average 5 runs out of about 20, but you can always run the code for yourself (see the bottom of this post).

Also, keep in mind that these tests were done with data sets of 100 records. When you start going to larger data sets, say around 1000000, the differences in timing appear to grow exponentially. For instance, using a data set of 1000 records, it takes the generic dictionary 140 seconds to complete, it takes the tuple 45 seconds, and the MultiKeyDictionary .12 seconds.

Times below are in ticks. (Click to enlarge)

Image 1

The full code for my performance test is below... If you want to use it, you'll need to have my MultiKeyDictionary class.

C#
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using System.Linq;
using Aron.Weiler;
using System.Collections.Generic;

namespace MultiKeyGenericDictionarySpike
{
    class Program
    {
        // Instantiate classes with a value (string), 
        // primary key (int) and sub key (string)
        static MultiKeyDictionary<int, string, string> mkDictionary = 
			new MultiKeyDictionary<int, string, string>();

        static List<Tuple<string, string, string>> tuples = 
			new List<Tuple<string, string, string>>();
        static ReaderWriterLockSlim tupleLock = new ReaderWriterLockSlim();

        static Dictionary<int, KeyValuePair<string, string>> genericDictionary = 
			new Dictionary<int, KeyValuePair<string, string>>();
        static ReaderWriterLockSlim genericDictionaryLock = new ReaderWriterLockSlim();

        static CountdownEvent countDown;

        static void Main(string[] args)
        {
        perfTest:

            Console.WriteLine("------------------");
            Console.WriteLine("");

            Console.WriteLine("Performance Test Running...");
            Console.WriteLine("");

            // Slowest to fastest ;)
            RunTuplePerfTest();
            RunGenericDictionaryPerfTest();            
            RunMultiKeyDictionaryPerfTest();

            Console.WriteLine("Press any key to quit, or P to run perf test again...");
            if (Console.ReadKey().Key == ConsoleKey.P)
                goto perfTest;
        }

        private static void RunTuplePerfTest()
        {
            Task[] tasks = SetUpPerfTask(TuplePerfTest);

            RunTasks(tasks, "Tuple");
        }

        private static void RunMultiKeyDictionaryPerfTest()
        {
            Task[] tasks = SetUpPerfTask(MultiKeyDictionaryPerfTest);

            RunTasks(tasks, "MK Dictionary");
        }

        private static void RunGenericDictionaryPerfTest()
        {
            Task[] tasks = SetUpPerfTask(GenericDictionaryPerfTest);

            RunTasks(tasks, "Generic Dictionary");
        }

        private static void RunTasks(Task[] tasks, string name)
        {
            Stopwatch sw = ResetStopWatchAndCountdown();

            sw.Start();

            foreach (Task task in tasks)
            {
                task.Start();
            }

            countDown.Wait();

            sw.Stop();

            Console.WriteLine("Done in {0} ticks ({1} seconds) for {2}", 
			sw.ElapsedTicks, sw.Elapsed.TotalSeconds, name);
        }

        private static Task[] SetUpPerfTask(Action<int, int> perfTest)
        {
            int seed = 100;
            int count = 99;

            Task[] tasks = new Task[10];

            for (int i = 0; i < 10; i++)
            {
                int temp = i;

                tasks[i] = new Task(() => perfTest(seed * temp, count));
            }

            return tasks;
        }

        private static Stopwatch ResetStopWatchAndCountdown()
        {
            Stopwatch sw = new Stopwatch();
            countDown = new CountdownEvent(10);
            return sw;
        }

        private static void MultiKeyDictionaryPerfTest(int seed, int count)
        {
            Parallel.For(seed, seed + count, (i) =>
            {
                mkDictionary.Add(i, string.Format("Sub Key {0}", i), 
					string.Format("Value {0}", i));
            });

            Parallel.For(seed, seed + count, (i) =>
            {
                Debug.Assert(mkDictionary[i] == (string.Format("Value {0}", i)));
            });

            Parallel.For(seed, seed + count, (i) =>
            {
                mkDictionary.Associate(string.Format("Sub Key {0}", i), i);
            });

            Parallel.For(seed, seed + count, (i) =>
            {
                Debug.Assert(mkDictionary[string.Format("Sub Key {0}", i)] == 
					(string.Format("Value {0}", i)));
            });

            Parallel.For(seed, seed + count, (i) =>
            {
                mkDictionary.Remove(i);
            });

            countDown.Signal();
        }

        private static void GenericDictionaryPerfTest(int seed, int count)
        {
            Parallel.For(seed, seed + count, (i) =>
            {
                genericDictionaryLock.EnterWriteLock();

                try
                {
                    genericDictionary.Add(i, new KeyValuePair<string, string>
		  (string.Format("Sub Key {0}", i), string.Format("Value {0}", i)));
                }
                finally
                {
                    genericDictionaryLock.ExitWriteLock();
                }                
            });

            Parallel.For(seed, seed + count, (i) =>
            {
                // Have to create a new KVP each time
                KeyValuePair<string, string> kvp = new KeyValuePair<string, string>
		(string.Format("Sub Key {0}", i), string.Format("Value {0}", i));

                genericDictionaryLock.EnterReadLock();

                try
                {
                    // == doesn't appear to work...
                    Debug.Assert(genericDictionary[i].Equals(kvp));
                }
                finally
                {
                    genericDictionaryLock.ExitReadLock();
                }
            });

            // Associate
            Parallel.For(seed, seed + count, (i) =>
            {
                genericDictionaryLock.EnterUpgradeableReadLock();

                try
                {
                    if (genericDictionary.ContainsKey(i))
                    {
                        try
                        {
                            genericDictionaryLock.EnterWriteLock();

                            genericDictionary[i] = new KeyValuePair<string, string>
			(string.Format("Sub Key {0}", i), genericDictionary[i].Value);
                        }
                        finally
                        {
                            genericDictionaryLock.ExitWriteLock();
                        }
                    }
                    else
                    {
                        // Realistic error checking/handling would usually go here
                        throw new Exception(); 
                    }
                }
                finally
                {
                    genericDictionaryLock.ExitUpgradeableReadLock();
                }
            });

            // Lookup by subkey
            Parallel.For(seed, seed + count, (i) =>
            {
                genericDictionaryLock.EnterReadLock();

                try
                {
                    var val = (from d in genericDictionary.Values where d.Key == 
			string.Format("Sub Key {0}", i) select d.Value).Single();

                    Debug.Assert(val == string.Format("Value {0}", i));
                }
                finally
                {
                    genericDictionaryLock.ExitReadLock();
                }
            });

            Parallel.For(seed, seed + count, (i) =>
            {
                genericDictionaryLock.EnterUpgradeableReadLock();

                try
                {
                    // Realistic error checking
                    if (genericDictionary.ContainsKey(i))
                    {
                        try
                        {
                            genericDictionaryLock.EnterWriteLock();

                            genericDictionary.Remove(i);
                        }
                        finally
                        {
                            genericDictionaryLock.ExitWriteLock();
                        }
                    }
                    else
                    {
                        throw new Exception();
                    }
                }
                finally
                {
                    genericDictionaryLock.ExitUpgradeableReadLock();
                }
            });

            countDown.Signal();
        }

        /// <summary>
        /// For the purpose of actually checking the performance of this list, 
        /// I will have a string primary key, forcing me to use LINQ to look up my values.
        /// </summary>
        /// <param name="seed"></param>
        /// <param name="count"></param>
        private static void TuplePerfTest(int seed, int count)
        {
            // Add all of the primary keys, sub keys and values
            Parallel.For(seed, seed + count, (i) =>
            {
                tupleLock.EnterWriteLock();

                try
                {
                    tuples.Add(new Tuple<string, string, string>(i.ToString(), 
		   string.Format("Sub Key {0}", i), string.Format("Value {0}", i)));
                }
                finally
                {
                    tupleLock.ExitWriteLock();
                }                    
            });

            // Verify that the correct value exists for each item
            Parallel.For(seed, seed + count, (i) =>
            {
                tupleLock.EnterReadLock();

                try
                {
                    Debug.Assert((from t in tuples where t.Item1 == 
			string.Format("{0}", i) select t).Count() > 0);
                }
                finally
                {
                    tupleLock.ExitReadLock();
                }    
            });

            // Re-associate the sub-key
            Parallel.For(seed, seed + count, (i) =>
            {
                tupleLock.EnterUpgradeableReadLock();

                try
                {
                    // Can't just associate with this tuple, as it is readonly, 
                    // have to create a new one.
                    // Separating the reading and writing here
                    Tuple<string, string, string> item = (from t in tuples 
		   where t.Item1 == string.Format("{0}", i) select t).Single();
                    item = new Tuple<string, string, string>(item.Item1, 
			string.Format("Sub Key {0}", i), item.Item3);

                    try
                    {
                        tupleLock.EnterWriteLock();

                        tuples.Remove(item);
                        tuples.Add(item);
                    }
                    finally
                    {
                        tupleLock.ExitWriteLock();
                    }
                }
                finally
                {
                    tupleLock.ExitUpgradeableReadLock();
                }                
            });

            // Verify that I can get the correct values using the subkey 
            // as the key into the tuple
            Parallel.For(seed, seed + count, (i) =>
            {    
                tupleLock.EnterReadLock();

                try
                {
                    Debug.Assert((from t in tuples where t.Item2 == 
			string.Format("Sub Key {0}", i) select t).Count() > 0);
                }
                finally
                {
                    tupleLock.ExitReadLock();
                }
            });

            Parallel.For(seed, seed + count, (i) =>
            {
                tupleLock.EnterUpgradeableReadLock();

                try
                {
                    // Separating the reading and writing here
                    Tuple<string, string, string> item = 
			(from t in tuples where t.Item1 == string.Format("{0}", i) 
			select t).Single();

                    try
                    {
                        tupleLock.EnterWriteLock();

                        tuples.Remove(item);
                    }
                    finally
                    {
                        tupleLock.ExitWriteLock();
                    }
                }
                finally
                {
                    tupleLock.ExitUpgradeableReadLock();
                }
            });

            countDown.Signal();
        }
    }
}

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)