
Looking up items in HashTable/Dictionary objects that have multiple keys

1 May 2008, CPOL, 4 min read
Dictionary objects take a single key as a look up key. This class simplifies using a Dictionary when you have multiple keys, such as two strings and an int, etc.

Introduction

Dictionary objects take a single key as a look up key. This class simplifies using a Dictionary when you have multiple keys, such as two strings and an int, etc. Use this class when just tacking all the keys into a single string and using that as a key makes you feel dirty.

Here is an example of how this class can be used, with a sample class called TestClass. Define a new class for the key and, in it, implement the abstract method GetKeyValues, which returns an array of the values that make up the key. In that method, just return the properties you want to use to look up the object in a Dictionary.

C#
/// <summary>
/// Test class that we want to look up by key
/// </summary>
public class TestClass
{
    public string Column1 = null;
    public string Column2 = null;
    public TestClass(string Column1, string Column2)
    {
        this.Column1 = Column1;
        this.Column2 = Column2;
    }
}

//define key to use for test class
public class TestClassKey : ClassKey<TestClass>
{
    //Init with object
    public TestClassKey(TestClass ClassReference) : base(ClassReference) { }
    //return list of column values we need to use as a key
    public override object[] GetKeyValues()
    {
        return new object[] { 
            ClassReference.Column1, 
            ClassReference.Column2
        };
    }
}

And, here is an example of a unit test confirming that it does indeed work:

C#
TestClass model1 = new TestClass("abc", "def");
TestClass model2 = new TestClass("abc", "def");

Assert.AreEqual(new TestClassKey(model1), new TestClassKey(model2));
Assert.IsTrue(new TestClassKey(model1) == new TestClassKey(model2));

//change a different column on each model and make sure the keys are no longer equal
model1.Column1 = "xyz";
model2.Column2 = "123";

Assert.AreNotEqual(new TestClassKey(model1), new TestClassKey(model2));
Assert.IsTrue(new TestClassKey(model1) != new TestClassKey(model2));
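The tests above only compare keys with each other, so here is a minimal sketch of the actual Dictionary lookup the key is meant for. To keep the example self-contained it uses a condensed stand-in for ClassKey (same hashing scheme and hash-based Equals as the full listing, operators left out); LookupDemo and Find are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Condensed stand-in for the article's ClassKey<T> (operators omitted).
public abstract class ClassKey<T> where T : class
{
    public T ClassReference { get; set; }
    protected ClassKey(T item) { ClassReference = item; }
    public abstract object[] GetKeyValues();

    public override bool Equals(object obj)
    {
        ClassKey<T> other = obj as ClassKey<T>;
        return !ReferenceEquals(other, null) && other.GetHashCode() == GetHashCode();
    }

    public override int GetHashCode()
    {
        int hash = 17;
        foreach (object value in GetKeyValues())
            if (value != null) hash = unchecked(hash * 37 + value.GetHashCode());
        return hash;
    }
}

public class TestClass
{
    public string Column1;
    public string Column2;
    public TestClass(string column1, string column2) { Column1 = column1; Column2 = column2; }
}

public class TestClassKey : ClassKey<TestClass>
{
    public TestClassKey(TestClass item) : base(item) { }
    public override object[] GetKeyValues()
    {
        return new object[] { ClassReference.Column1, ClassReference.Column2 };
    }
}

public static class LookupDemo
{
    // Index two rows by their key, then look one up by column values.
    public static string Find(string column1, string column2)
    {
        var index = new Dictionary<TestClassKey, TestClass>();
        foreach (TestClass item in new[] { new TestClass("abc", "def"), new TestClass("abc", "xyz") })
            index[new TestClassKey(item)] = item;

        // The probe key wraps a throwaway object that carries the values we are looking for.
        var probe = new TestClassKey(new TestClass(column1, column2));
        TestClass found;
        return index.TryGetValue(probe, out found) ? found.Column2 : null;
    }

    public static void Main()
    {
        Console.WriteLine(Find("abc", "xyz") ?? "(not found)");
    }
}
```

One thing to watch: the key reads its values from the referenced object every time the hash code is computed, so changing Column1 or Column2 on an object after it has been added to the Dictionary makes that entry unreachable.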

Using the code

Since this is primarily for use with Dictionary objects, I had to get very familiar with equality overriding and the GetHashCode method. This is something you want to do once and in one place, as it is very, very easy to get wrong. I've been using this in production systems for more than a few months, and have added tests for various bugs I've seen, so I'm pretty confident this code works well. I'm posting it here because I'm curious what other people think, or if there is an easier way to do this. Here is the entire code for ClassKey.cs:

C#
/// <summary>
/// Defines a common set of operations and functionality for creating concrete 
/// key classes which allow us to lookup items in a collection
/// using one or more of the properties in that collection.
/// </summary>
public abstract class ClassKey<T> where T : class
{
    /// <summary>
    /// The collection item referenced by this key
    /// </summary>
    public T ClassReference
    {
        get { return _CollectionItem; }
        set { _CollectionItem = value; }
    }

    private T _CollectionItem = null;

    /// <summary>
    /// Init empty if needed
    /// </summary>
    public ClassKey() { }

    /// <summary>
    /// Init with specific collection item
    /// </summary>
    /// <param name="CollectionItem"></param>
    public ClassKey(T CollectionItem)
    {
        this.ClassReference = CollectionItem;
    }

    /// <summary>
    /// Compare based on hash code.
    /// Note: two keys whose hash codes happen to collide are treated as equal.
    /// </summary>
    /// <param name="obj"></param>
    /// <returns></returns>
    public override bool Equals(object obj)
    {
        if (obj is ClassKey<T>)
        {
            return (obj as ClassKey<T>).GetHashCode() == this.GetHashCode();
        }
        else
            return false; //definitely not equal
    }

    public static bool operator ==(ClassKey<T> p1, ClassKey<T> p2)
    {
        //if both null, then equal
        if ((object)p1 == null && (object)p2 == null) return true;
        //if one or other null, then not since above 
        //we guaranteed if here one is not null
        if ((object)p1 == null || (object)p2 == null) return false;
        //compare on fields
        return (p1.Equals(p2));
    }

    public static bool operator !=(ClassKey<T> p1, ClassKey<T> p2)
    {
        return !(p1 == p2);
    }

    //must override to get list of key values
    public abstract object[] GetKeyValues();

    /// <summary>
    /// Implement hash code function to specify 
    /// which columns will be used for the key
    /// without using reflection which may be a bit slow.
    /// </summary>
    /// <returns></returns>
    public override int GetHashCode()
    {
        object[] keyValues = GetKeyValues();
        //use co-prime numbers to salt the hashcode 
        //so same values in different order will 
        //not return as equal - see TestClassKeyXOROrderProblem
        //to reproduce problem 
        //http://www.msnewsgroups.net/group/microsoft.public.dotnet.languages.csharp/topic36405.aspx
        //http://directxinfo.blogspot.com/2007/06/gethashcode-in-net.html

        //first co-prime number
        int FinalHashCode = 17;
        //other co-prime number - ask me if I know what 
        //co-prime means, go ahead, ask me.
        int OtherCoPrimeNumber = 37;
        //get total hashcode to return
        if (keyValues != null)
        {
            foreach (object keyValue in keyValues)
            {
                //can't get hash code if null
                if (keyValue != null)
                {
                    //unchecked so overflow wraps around instead of
                    //throwing when the project is compiled with /checked
                    unchecked
                    {
                        FinalHashCode = FinalHashCode *
                              OtherCoPrimeNumber + keyValue.GetHashCode();
                    }
                }
            }
        }
        return FinalHashCode;
    }
}
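One consequence of comparing keys by hash code alone is that two different sets of values that happen to produce the same hash are reported as equal. The sketch below is not part of the article's download; ValueKey, PairKey and CollisionDemo are hypothetical names. It keeps the same 17/37 hashing scheme but has Equals compare the key values themselves, and, unlike the listing above, it hashes a null value as 0 instead of skipping it so that the position of a null still counts.

```csharp
using System;

// Hypothetical variant of the base class: Equals compares the key values
// themselves rather than the hash codes.
public abstract class ValueKey
{
    public abstract object[] GetKeyValues();

    public override bool Equals(object obj)
    {
        ValueKey other = obj as ValueKey;
        if (ReferenceEquals(other, null) || other.GetType() != GetType()) return false;

        object[] mine = GetKeyValues();
        object[] theirs = other.GetKeyValues();
        if (mine.Length != theirs.Length) return false;

        for (int i = 0; i < mine.Length; i++)
        {
            // object.Equals(a, b) is null-safe
            if (!object.Equals(mine[i], theirs[i])) return false;
        }
        return true;
    }

    public override int GetHashCode()
    {
        int hash = 17;
        foreach (object value in GetKeyValues())
            hash = unchecked(hash * 37 + (value == null ? 0 : value.GetHashCode()));
        return hash;
    }
}

// Two-int key used only to demonstrate a collision.
public sealed class PairKey : ValueKey
{
    private readonly int _a;
    private readonly int _b;
    public PairKey(int a, int b) { _a = a; _b = b; }
    public override object[] GetKeyValues() { return new object[] { _a, _b }; }
}

public static class CollisionDemo
{
    public static void Main()
    {
        // (17 * 37 + 0) * 37 + 37 == (17 * 37 + 1) * 37 + 0 == 23310
        PairKey x = new PairKey(0, 37);
        PairKey y = new PairKey(1, 0);
        Console.WriteLine(x.GetHashCode() == y.GetHashCode()); // True
        Console.WriteLine(x.Equals(y));                        // False
    }
}
```

With the hash-only Equals, the two keys in CollisionDemo would compare as equal and the second would overwrite the first in a Dictionary. Comparing values costs an extra GetKeyValues call per comparison, which is the tradeoff to weigh.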

Points of interest

You can also inherit directly from ClassKey if you don't want to use a separate class for comparisons. It is also a handy base class when you need to override == or GetHashCode, because you no longer have to remember all the related members (Equals, ==, != and GetHashCode) that must be overridden together any time you touch one of them. I would be interested to hear from people if the portions dealing with prime and co-prime numbers can be done in a different or better way.

Struct implementation

I added another download file which has a struct-only implementation of the key, after reading one of the reader comments. I dislike that implementation because the contents of the key have to be specified again for every key type, since structs don't support inheritance. That being said, it does appear to be about 15-20% faster for lookups, but over a million lookups that works out to roughly 1200ms vs. 1000ms, so I still prefer the class/inheritance method.
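For readers who don't have the download, a struct key for the two-column example might look roughly like the sketch below. This is an assumption about the general shape, not the code from the zip file; TwoColumnKey and StructKeyDemo are hypothetical names. Note how the field list has to be repeated in the constructor, in Equals and in GetHashCode, which is the duplication complained about above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct key for the two-column example.
public struct TwoColumnKey : IEquatable<TwoColumnKey>
{
    private readonly string _column1;
    private readonly string _column2;

    public TwoColumnKey(string column1, string column2)
    {
        _column1 = column1;
        _column2 = column2;
    }

    // Dictionary's default comparer uses this typed overload, which avoids boxing.
    public bool Equals(TwoColumnKey other)
    {
        return _column1 == other._column1 && _column2 == other._column2;
    }

    public override bool Equals(object obj)
    {
        return obj is TwoColumnKey && Equals((TwoColumnKey)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 37 + (_column1 == null ? 0 : _column1.GetHashCode());
            hash = hash * 37 + (_column2 == null ? 0 : _column2.GetHashCode());
            return hash;
        }
    }
}

public static class StructKeyDemo
{
    // Returns the stored value for the given columns, or -1 when there is no match.
    public static int Lookup(string column1, string column2)
    {
        var index = new Dictionary<TwoColumnKey, int>();
        index[new TwoColumnKey("abc", "def")] = 1;
        index[new TwoColumnKey("abc", "xyz")] = 2;

        int value;
        return index.TryGetValue(new TwoColumnKey(column1, column2), out value) ? value : -1;
    }

    public static void Main()
    {
        Console.WriteLine(StructKeyDemo.Lookup("abc", "xyz"));
    }
}
```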

Performance measurements on various methods

I got the numbers below by running the unit tests for each of the methods. I tried to keep the tests more or less the same to keep the comparison fair. To measure memory, I simply killed ProcessInvocation.exe (the TestDriven.NET test runner) and ran the perf test, which does a million iterations 10 times and averages the results for initialization and lookup.

No matter which way you go, there is a tradeoff. I still prefer the ClassKey method. Though it takes a bit longer (200ms on a million rows), I think it makes bugs far less likely to appear, and it is much more intuitive. The struct implementation takes a bit longer to initialize and is faster for lookups, but it is less maintainable. The Dictionary of Dictionaries is the fastest for lookups, but it takes longer to initialize and uses roughly twice as much memory as the ClassKey method, presumably because it creates another dictionary object for each item in the list. I also consider it to be the worst syntax and maintainability-wise.

The concatenated string key isn't too terrible for performance, so if you are lazy and don't want to implement something like this, I think that would be the way to go, as long as you have a common method for constructing the key that can be reused (rather than specified on every use). It did take a significantly larger amount of memory, though.

These numbers aren't guaranteed or perfect, just some back-of-the-envelope measurements from my system to give some basis for comparison.

Class key

Initialization: 3,018ms
Lookups: 1,144ms
Memory - Never above 313MB

Struct

Initialization: 3,210ms
Lookups: 1,064ms
Memory - Never above 354MB

Dictionary of Dictionaries

Initialization: 4,313ms
Lookups: 919ms
Memory - Never above 555MB

Concatenated string key dictionary

Initialization: 3,305ms
Lookups: 1,039ms
Memory - Never above 460MB

Tuple method

Initialization: 3,810ms
Lookups: 3,241ms
Memory - Never above 316MB
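For reference, the two plain-Dictionary alternatives measured above can be sketched roughly as follows. This is an illustration of the general techniques, not the test code from the download: AlternativeLookups, MakeKey and the other member names are hypothetical, and the separator character is an assumption (it must be something that never occurs in the values, otherwise "a" + "bc" and "ab" + "c" style clashes become possible).

```csharp
using System;
using System.Collections.Generic;

public static class AlternativeLookups
{
    // One shared helper for building the concatenated key, so the format
    // is defined in a single place instead of at every call site.
    public static string MakeKey(string column1, string column2)
    {
        return column1 + "\u001F" + column2;
    }

    public static string ViaConcatenatedKey(string column1, string column2)
    {
        var index = new Dictionary<string, string>();
        index[MakeKey("abc", "def")] = "row 1";
        index[MakeKey("abc", "xyz")] = "row 2";

        string row;
        return index.TryGetValue(MakeKey(column1, column2), out row) ? row : null;
    }

    public static string ViaNestedDictionaries(string column1, string column2)
    {
        var index = new Dictionary<string, Dictionary<string, string>>();
        Add(index, "abc", "def", "row 1");
        Add(index, "abc", "xyz", "row 2");

        // Two lookups: first by Column1, then by Column2 in the inner dictionary.
        Dictionary<string, string> inner;
        if (!index.TryGetValue(column1, out inner)) return null;
        string row;
        if (!inner.TryGetValue(column2, out row)) return null;
        return row;
    }

    private static void Add(Dictionary<string, Dictionary<string, string>> index,
                            string column1, string column2, string row)
    {
        Dictionary<string, string> inner;
        if (!index.TryGetValue(column1, out inner))
        {
            // Every distinct Column1 value gets its own inner dictionary.
            inner = new Dictionary<string, string>();
            index[column1] = inner;
        }
        inner[column2] = row;
    }

    public static void Main()
    {
        Console.WriteLine(ViaConcatenatedKey("abc", "xyz"));
        Console.WriteLine(ViaNestedDictionaries("abc", "xyz"));
    }
}
```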

History

  • 22-Apr-2008
    • Initial version.
    • Fought with formatting, got sick of dealing with the CodeProject editor.
  • 23-Apr-2008
    • Added new zip file with struct implementation and unit test to run 1 million times.
    • Added new zip file with straight dictionary implementations and all of the above.
  • 30-Apr-2008
    • Added code download for unchecked/tuple discussions.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)