Introduction
Primary keys are usually auto-incrementing integers or GUIDs. Unfortunately, many domains need more than that. A primary key does not have to be a single column; it only has to be unique. Sooner rather than later you will find yourself needing composite primary keys. When using composite keys, it is very useful to be able to apply equality functions to key values. In fact, NHibernate requires composite key classes to override the equality members (Equals and GetHashCode). This article is about making the process of implementing those equality functions painless.
Contents
- The Model
- Automatic Equality Functions
- Using the Code
- Statistics
- Conclusions
- Useful Links
The Model
Let's take a look at the following Domain.
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   assembly="Myotragus.Data.Tupples.Tests"
                   namespace="Myotragus.Data.Tupples.Tests.Domain">
  <class name="Product">
    <id name="Id" column="ProductId">
      <generator class="identity"/>
    </id>
    <property name="Name"/>
  </class>
  <class name="Category">
    <id name="Id" column="CategoryId">
      <generator class="identity"/>
    </id>
    <property name="Name"/>
  </class>
  <class name="CategoryProducts">
    <composite-id name="Key">
      <key-property name="ProductId"/>
      <key-property name="CategoryId"/>
    </composite-id>
    <property name="CustomDescription"/>
  </class>
</hibernate-mapping>
Code excerpt 1: NHibernate mapping for sample Domain
The mapping above defines a domain with three entities. Product and Category need no explanation. CategoryProducts, on the other hand, does. If you are experienced with NHibernate, you would probably use a collection within both Product and Category to represent the many-to-many relationship. I prefer to keep my POCOs clear of relationships, but that's me. For the sake of this example, we'll use the mapping the way I would in real life. Let's now take a look at the POCOs.
public class Category
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}
public class Product
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}
Code excerpt 2: Simple POCOs: Product and Category
public class CategoryProducts
{
    public CategoryProducts()
    {
        Key = new CategoryProductsKey();
    }
    public virtual CategoryProductsKey Key { get; set; }
    public virtual int ProductId
    {
        get { return Key.ProductId; }
        set { Key.ProductId = value; }
    }
    public virtual int CategoryId
    {
        get { return Key.CategoryId; }
        set { Key.CategoryId = value; }
    }
    public virtual string CustomDescription { get; set; }
}
Code excerpt 3: Not so simple POCO
CategoryProducts uses a composite key with two fields: one references Product, the other Category. NHibernate requires composite key types to override the equality members. Now let's take a look at the key implementation.
public class CategoryProductsKey
{
    public int ProductId { get; set; }
    public int CategoryId { get; set; }
    // Combine the hash codes of both key fields.
    public override int GetHashCode()
    {
        return ProductId ^ CategoryId;
    }
    public override bool Equals(object x)
    {
        return Equals(x as CategoryProductsKey);
    }
    // Two keys are equal when both fields match.
    public bool Equals(CategoryProductsKey x)
    {
        return x != null && x.ProductId == ProductId &&
               x.CategoryId == CategoryId;
    }
}
Code excerpt 4: Composite key implementation
As you can see, implementing the equality members is very simple. In fact, it is extremely straightforward in most cases.
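To see why these overrides matter, here is a minimal sketch of a hashed lookup. The key class is repeated only to keep the block self-contained; KeyLookupDemo and its Lookup method are hypothetical names introduced for this illustration. A second key instance with the same field values finds the stored entry only because Equals and GetHashCode compare by value.

```csharp
using System;
using System.Collections.Generic;

// Same key as in excerpt 4, repeated so this block compiles on its own.
public class CategoryProductsKey
{
    public int ProductId { get; set; }
    public int CategoryId { get; set; }

    public override int GetHashCode() { return ProductId ^ CategoryId; }

    public override bool Equals(object x) { return Equals(x as CategoryProductsKey); }

    public bool Equals(CategoryProductsKey x)
    {
        return x != null && x.ProductId == ProductId && x.CategoryId == CategoryId;
    }
}

// Hypothetical helper, not part of the original article.
public static class KeyLookupDemo
{
    public static string Lookup()
    {
        var descriptions = new Dictionary<CategoryProductsKey, string>();
        descriptions[new CategoryProductsKey { ProductId = 1, CategoryId = 2 }] = "first";

        // A different instance with the same values.
        var probe = new CategoryProductsKey { ProductId = 1, CategoryId = 2 };

        string found;
        return descriptions.TryGetValue(probe, out found) ? found : null;
    }
}
```

Without the overrides, the dictionary would fall back to reference equality and the lookup with the second instance would miss.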
Automatic Equality Functions
Let's now formally (C#) define a straightforward equality implementation for a composite key.
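// Requires: using System.Reflection;
public static bool AreEqual<TKey>(TKey x, TKey y)
{
    var result = true;
    // Compare every public instance property of the key type.
    foreach (var property in typeof(TKey).GetProperties(BindingFlags.Instance | BindingFlags.Public))
        result &= object.Equals(property.GetValue(x, null), property.GetValue(y, null));
    return result;
}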
Code excerpt 5: Straightforward Equals implementation
A couple of optimizations could be done, but right now they are not important.
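// Requires: using System.Reflection;
public static int GetHashCode<TKey>(TKey x)
{
    var result = 0;
    foreach (var property in typeof(TKey).GetProperties(BindingFlags.Instance | BindingFlags.Public))
    {
        // A null property value contributes 0 to the hash.
        var value = property.GetValue(x, null);
        result ^= value == null ? 0 : value.GetHashCode();
    }
    return result;
}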
Code excerpt 6: Straightforward GetHashCode implementation
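One property of XOR-combined hashes is worth keeping in mind: XOR is symmetric, so keys whose field values are swapped get the same hash code. This affects only how evenly keys spread across hash buckets, not correctness, because Equals still tells them apart. The sketch below illustrates the effect. HashCombineSketch and OrderedHash are hypothetical names, and the multiply-and-add combination is a common alternative shown as an assumption, not something the article's code uses.

```csharp
using System;

// Hypothetical helper, not part of the original article.
public static class HashCombineSketch
{
    // The XOR combination used throughout the article.
    public static int XorHash(int productId, int categoryId)
    {
        return productId ^ categoryId;
    }

    // An order-sensitive alternative (assumption): multiply and add.
    public static int OrderedHash(int productId, int categoryId)
    {
        unchecked
        {
            var hash = 17;
            hash = hash * 31 + productId;
            hash = hash * 31 + categoryId;
            return hash;
        }
    }
}
```

With this sketch, XorHash(1, 2) and XorHash(2, 1) are both 3, while OrderedHash(1, 2) and OrderedHash(2, 1) differ.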
Using the Code
Got the idea, right? What we're going to do now is encapsulate this definition in a class that generates the equality functions, so that clients can later use them to compare composite keys. Using it would look like this:
var o1 = new TKey { P1 = v11, P2 = v21, P3 = v31 } ;
var o2 = new TKey { P1 = v12, P2 = v22, P3 = v32 } ;
Func<TKey, TKey, bool> AreEquals =
EqualityFunctionsGenerator<TKey>.CreateEqualityComparer();
var r = AreEquals(o1, o2) ;
Func<TKey, int> GetHashCode = EqualityFunctionsGenerator<TKey>.CreateGetHashCode();
var c1 = GetHashCode(o1) ;
var c2 = GetHashCode(o2) ;
Code excerpt 7: Using automatically generated functions
You could think of thousands of uses for these functions, but the best one is probably to override the equality members of object in a base class and let clients do the rest.
public class Tupple<TObject> : IEquatable<TObject>
    where TObject : class
{
    private static readonly Func<TObject, int> GetHashCodeMethod =
        EqualityFunctionsGenerator<TObject>.CreateGetHashCode();
    private static readonly Func<TObject, TObject, bool> EqualsMethod =
        EqualityFunctionsGenerator<TObject>.CreateEqualityComparer();
    public override bool Equals(object obj)
    {
        return Equals(obj as TObject);
    }
    public override int GetHashCode()
    {
        var @this = ((object)this) as TObject;
        if (@this == null) return 0;
        return GetHashCodeMethod(@this);
    }
    public bool Equals(TObject other)
    {
        var @this = ((object)this) as TObject;
        if (other == null || @this == null) return false;
        return EqualsMethod(@this, other);
    }
}
Code excerpt 8: Implementation of Tupple
Extending Tupple makes everything work.
public class CategoryProductsKey : Tupple<CategoryProductsKey>
{
public virtual int ProductId { get; set; }
public virtual int CategoryId { get; set; }
}
Code excerpt 9: Wow! That was easy
All that is missing is the implementation of the functions generator. Here it is:
public class EqualityFunctionsGenerator<TObject>
{
public static readonly Type TypeOfTObject = typeof(TObject);
public static readonly Type TypeOfBool = typeof(bool);
public static readonly MethodInfo MethodEquals =
typeof(object).GetMethod("Equals",
BindingFlags.Static | BindingFlags.Public);
public static readonly MethodInfo MethodGetHashCode =
typeof(object).GetMethod("GetHashCode",
BindingFlags.Instance | BindingFlags.Public);
public static Func<TObject, TObject, bool> CreateEqualityComparer()
{
var x = Expression.Parameter(TypeOfTObject, "x");
var y = Expression.Parameter(TypeOfTObject, "y");
var result = (Expression)Expression.Constant(true, TypeOfBool);
foreach (var property in GetProperties())
{
var comparison = CreatePropertyComparison(property, x, y);
result = Expression.AndAlso(result, comparison);
}
return Expression.Lambda<Func<TObject, TObject, bool>>(result, x, y).Compile();
}
private static Expression CreatePropertyComparison(PropertyInfo property,
Expression x, Expression y)
{
var type = property.PropertyType;
var propertyOfX = GetPropertyValue(x, property);
var propertyOfY = GetPropertyValue(y, property);
return (type.IsValueType)? CreateValueTypeComparison(propertyOfX, propertyOfY)
:CreateReferenceTypeComparison(propertyOfX, propertyOfY);
}
private static Expression GetPropertyValue(Expression obj, PropertyInfo property)
{
return Expression.Property(obj, property);
}
private static Expression CreateReferenceTypeComparison(Expression x, Expression y)
{
return Expression.Call(MethodEquals, x, y);
}
private static Expression CreateValueTypeComparison(Expression x, Expression y)
{
return Expression.Equal(x, y);
}
public static IEnumerable<PropertyInfo> GetProperties()
{
return TypeOfTObject.GetProperties(BindingFlags.Instance | BindingFlags.Public);
}
public static Func<TObject, int> CreateGetHashCode()
{
var obj = Expression.Parameter(TypeOfTObject, "obj");
var result = (Expression)Expression.Constant(0);
foreach (var property in GetProperties())
{
var hash = CreatePropertyGetHashCode(obj, property);
result = Expression.ExclusiveOr(result, hash);
}
return Expression.Lambda<Func<TObject, int>>(result, obj).Compile();
}
private static Expression CreatePropertyGetHashCode(Expression obj, PropertyInfo property)
{
var type = property.PropertyType;
var propertyOfObj = GetPropertyValue(obj, property);
return type.IsValueType ? CreateValueTypeGetHashCode(propertyOfObj)
: CreateReferenceTypeGetHashCode(propertyOfObj);
}
private static Expression CreateReferenceTypeGetHashCode(Expression value)
{
return Expression.Condition(
Expression.Equal(Expression.Constant(null), value),
Expression.Constant(0),
Expression.Call(value, MethodGetHashCode));
}
private static Expression CreateValueTypeGetHashCode(Expression value)
{
return Expression.Call(value, MethodGetHashCode);
}
private static Expression CheckForNull(Expression value)
{
return Expression.Condition(
Expression.Equal(Expression.Constant(null), value),
Expression.Constant(0),
value);
}
}
Code excerpt 10: Equality functions generator
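To make the shape of the generated expression easier to picture, the following sketch builds the same kind of tree by hand for a fixed two-property key. PairKey and ExpressionSketch are hypothetical names used only for this illustration. The generator above does the equivalent for whatever public properties it discovers through reflection.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical key type, used only for this illustration.
public class PairKey
{
    public int A { get; set; }
    public int B { get; set; }
}

public static class ExpressionSketch
{
    // Builds and compiles: (x, y) => true && x.A == y.A && x.B == y.B
    public static Func<PairKey, PairKey, bool> BuildEquals()
    {
        var x = Expression.Parameter(typeof(PairKey), "x");
        var y = Expression.Parameter(typeof(PairKey), "y");

        // Start from the constant true and chain one comparison per property.
        Expression body = Expression.Constant(true);
        foreach (var name in new[] { "A", "B" })
        {
            var comparison = Expression.Equal(
                Expression.Property(x, name),
                Expression.Property(y, name));
            body = Expression.AndAlso(body, comparison);
        }

        // Compile once; the resulting delegate can be reused for every comparison.
        return Expression.Lambda<Func<PairKey, PairKey, bool>>(body, x, y).Compile();
    }
}
```

Because the tree is compiled into a delegate a single time, later calls avoid the per-call reflection cost of the straightforward version in excerpts 5 and 6.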
Statistics
What is it good for if it is slow? I ran some performance tests, and the results were not as good as I expected, but they were good enough. Tests were run on a 1 CPU/1GB VirtualBox virtual machine, over a Phenom II x6 1055T 2.46Ghz. Two tupple types were created for the tests: one automatic and one implemented by hand. Their results are shown next to the equivalent tests over int and Point.
public class AutomaticCompositeKey : Tupple<AutomaticCompositeKey>
{
public string KeyField1 { get; set; }
public int KeyField2 { get; set; }
public int KeyField3 { get; set; }
}
public class ImplementedCompositeKey
{
public string KeyField1 { get; set; }
public int KeyField2 { get; set; }
public int KeyField3 { get; set; }
public override int GetHashCode() {...}
public override bool Equals(object x) {...}
}
Code excerpt 11: Tupple made for testing
Equals Test Results
Test | 10M Cases | 100M Cases | 1G Cases |
int equality test | 0 | 0.04 | 0.566 |
Point equality using == operator | 0.01 | 0.1 | 1.398 |
Point equality using Equals | 0.08 | 0.671 | 8.12 |
Manually implemented tupple | 0.06 | 0.491 | 5.31 |
Automatic tupple | 0.19 | 1.122 | 13.2 |
GetHashCode Test Results
Test | 10M Cases | 100M Cases | 1G Cases |
Point | 0.01 | 0.05 | 0.496 |
Manually implemented tupple | 0.09 | 0.831 | 9.136 |
Automatic tupple | 0.23 | 1.402 | 15.03 |
Some additional cases were run over the automatic tupple to determine executions per second. An Excel file with the linear regression used to determine the following values is also available.
Function | Million Times per second |
Equals | 85.4 |
GetHashCode | 67.53 |
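The article does not include the test harness itself, so the following is only a rough sketch of how such timings could be taken with Stopwatch. TimingSketch, Measure and MillionsPerSecond are hypothetical names, and the original measurements may well have been obtained differently.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper, not the harness used for the tables above.
public static class TimingSketch
{
    // Runs the action the given number of times and returns elapsed seconds.
    public static double Measure(int iterations, Action action)
    {
        action(); // warm-up call so JIT compilation is not counted

        var watch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
            action();
        watch.Stop();

        return watch.Elapsed.TotalSeconds;
    }

    // Converts an iteration count and a duration into millions of calls per second.
    public static double MillionsPerSecond(int iterations, double seconds)
    {
        return iterations / seconds / 1e6;
    }
}
```

A call such as TimingSketch.Measure(10000000, () => key1.Equals(key2)) would time ten million comparisons of two hypothetical keys. Note that the delegate invocation adds its own overhead, so very cheap operations such as int comparison are better timed with the loop written inline.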
Conclusions
Every time I find a repetitive task, I try to automate it. Reflection, Emit, and now Linq.Expressions are amazingly helpful for doing so. This small package is part of a library I'm finishing these days. I hope you find it useful.
Useful Links