Using generics for calculations

Rüdiger Klaehn

11 Oct 2004

Performing calculations with generic types is not as easy as it seems. This article shows how to do it.

Download demo project - 32.8 Kb

Introduction

The current implementation of .NET generics is used mainly to make type-safe collections faster and easier to use. Doing calculations on generic types is not as straightforward.

The problem

An example of a calculation on generic types is a generic method that computes the sum of all elements in a List<T>. Of course, summing the elements of a list only makes sense if T is a type such as int, double, or decimal that defines an addition operation.

Somebody coming from a C++ background might implement such a method like this:

public class Lists {
    ...
    // Does not compile: T has no + operator, and 0 cannot be converted to T.
    public static T Sum<T>(List<T> list)
    {
        T sum=0;
        for(int i=0;i<list.Count;i++)
            sum+=list[i];
        return sum;
    }
    ...
}

This is not possible in C#, because an unconstrained type parameter is treated like System.Object, which does not define a + operation. For the same reason, the literal 0 cannot be assigned to a variable of type T.

To constrain type parameters in C#/.NET, you specify interfaces that the type has to implement. The problem is that interfaces may not contain any static methods, and operator methods are static methods.

So with the current constraint system, it is not possible to define operator constraints.

A clean way to enable numerical computations would be to let the basic data types like int, float, double, decimal etc. implement an interface for arithmetic operations. This interface could then be used to constrain the type parameters. It would work similarly to the IComparable<T> interface that all basic data types implement.
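To make the idea concrete, here is a minimal sketch of what such a constraint could look like. The interface IArithmetic<T> is hypothetical and is not implemented by any built-in type, so the sketch uses a small wrapper struct, MyInt (also a hypothetical name), to stand in for int.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface; the built-in numeric types do not implement it.
interface IArithmetic<T>
{
    T Add(T other);
}

// Stand-in for int, because int cannot be made to implement IArithmetic<T>.
struct MyInt : IArithmetic<MyInt>
{
    public readonly int Value;
    public MyInt(int value) { Value = value; }
    public MyInt Add(MyInt other) { return new MyInt(Value + other.Value); }
}

static class ConstrainedLists
{
    // The constraint guarantees that T has an Add method.
    public static T Sum<T>(List<T> list) where T : IArithmetic<T>, new()
    {
        T sum = new T();
        for (int i = 0; i < list.Count; i++)
            sum = sum.Add(list[i]);
        return sum;
    }
}

static class ConstraintDemo
{
    static void Main()
    {
        List<MyInt> list = new List<MyInt>();
        list.Add(new MyInt(1));
        list.Add(new MyInt(2));
        list.Add(new MyInt(3));
        Console.WriteLine(ConstrainedLists.Sum(list).Value); // prints 6
    }
}
```

If the basic data types implemented an interface of this kind, the wrapper struct would not be needed and Sum could be called on a List<int> directly.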

I tried to convince the people at Microsoft to implement such an interface, but apparently they won't be able to do it in time for Whidbey.

We are on our own

Many people have been thinking about this problem, among them Eric Gunnerson and even Anders Hejlsberg.

The solution proposed by Anders Hejlsberg was to have an abstract class Calculator<T> that has to be specialized for each primitive type. The generic type would then use an instance of the appropriate calculator to do the calculations.

Here is the code (copied from Eric Gunnerson's Blog):

First define the abstract base class:

public abstract class Calculator<T>
{
    public abstract T Add(T a, T b);
}

Then specialize for the types you want to perform calculations on:

namespace Int32
{
    public class Calculator: Calculator<int>
    {
        public override int Add(int a, int b)
        {
            return a + b;
        }
    } 
}

Then use an appropriate Calculator<T> to do the calculations. Here is an example that calculates the sum of all elements in a List<T>.

class AlgorithmLibrary<T> where T: new() 
{
    Calculator<T> calculator;

    public AlgorithmLibrary(Calculator<T> calculator)
    {
         this.calculator = calculator;
    } 

    public T Sum(List<T> items)
    {
        T sum = new T(); 

        for (int i = 0; i < items.Count; i++)
        {
            sum = calculator.Add(sum, items[i]);
        } 

        return sum;
    }
}

You would use it like this:

AlgorithmLibrary<int> library = new AlgorithmLibrary<int>(new Int32.Calculator());

There are many other creative solutions, but all of them have the drawback that they involve some kind of dynamic method invocation (virtual methods, interfaces or delegates). So while they make it possible to perform calculations with generic type parameters, the performance is unacceptably low for numerical applications.
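As an illustration of the delegate flavour of this idea, here is a minimal sketch. It is not taken from any of the posts mentioned above; the names BinaryOp<T> and DelegateLists are hypothetical. Every addition goes through a delegate invocation, which is exactly the kind of indirect call that costs performance in a tight loop.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical delegate type describing a binary operation on T.
public delegate T BinaryOp<T>(T a, T b);

public static class DelegateLists
{
    // Sums the list by invoking the supplied delegate once per element.
    public static T Sum<T>(List<T> list, BinaryOp<T> add) where T : new()
    {
        T sum = new T();
        for (int i = 0; i < list.Count; i++)
            sum = add(sum, list[i]);
        return sum;
    }
}

public static class DelegateDemo
{
    public static void Main()
    {
        List<int> list = new List<int>();
        list.Add(1);
        list.Add(2);
        list.Add(3);
        // An anonymous method supplies the addition for int.
        int total = DelegateLists.Sum<int>(list, delegate(int a, int b) { return a + b; });
        Console.WriteLine("The sum of all elements is {0}", total);
    }
}
```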

The Solution

The first person to come up with a solution that does not involve any virtual method invocation was Jeroen Frijters. The solution uses the fact that constraining type parameters using interfaces is not the same as casting to interfaces. Calling a method using an interface has the overhead of dynamic method dispatch, but calling a method on a type parameter that is constrained by an interface has no such overhead.

Here is an example. It is a generic method that sorts two numbers. The type T has to implement IComparable<T> so that we can be sure that it has the CompareTo method. But nevertheless, calling the CompareTo method does not have the overhead normally associated with interfaces.

public class Sorter
{
    private static void Swap<T>(ref T a, ref T b)
    {
        T t=a;a=b;b=t;
    }
    public static void Sort<T>(ref T a,ref T b)
        where T:IComparable<T>
    {
        if(a.CompareTo(b)>0)
            Swap(ref a,ref b);
    }
}
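A short usage sketch for the Sorter class, repeated here so that the example is self-contained. SorterDemo is a hypothetical name. The type argument is inferred from the arguments, and the same method works for any type that implements IComparable<T>.

```csharp
using System;

public class Sorter
{
    private static void Swap<T>(ref T a, ref T b)
    {
        T t = a; a = b; b = t;
    }

    public static void Sort<T>(ref T a, ref T b)
        where T : IComparable<T>
    {
        if (a.CompareTo(b) > 0)
            Swap(ref a, ref b);
    }
}

public static class SorterDemo
{
    public static void Main()
    {
        int x = 5, y = 3;
        Sorter.Sort(ref x, ref y);          // T is inferred as int
        Console.WriteLine("{0} {1}", x, y); // prints "3 5"

        string s = "pear", t = "apple";
        Sorter.Sort(ref s, ref t);          // T is inferred as string
        Console.WriteLine("{0} {1}", s, t);
    }
}
```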

The approach suggested by Jeroen Frijters uses a second type parameter. To avoid unnecessary object creation and virtual method dispatch, it is best to use a value type for the type containing the operations. Here is a small example for this approach:

interface ICalculator<T>
{
    T Add(T a,T b);
}

struct IntCalculator : ICalculator<int>
{
    public int Add(int a,int b) { return a+b; }
}

// struct FloatAdder ...

// struct DoubleAdder ...

class Lists<T,C>
    where T:new()
    where C:ICalculator<T>,new()
{
    //since C is a struct without fields, we can create an instance
    //of C whenever we need one without any overhead.
    private static C calculator=new C();
    public static T Sum(List<T> list)
    {
        T sum=new T();
        for(int i=0;i<list.Count;i++)
            sum=calculator.Add(sum,list[i]);
        return sum;
    }
}

The use of this class is a bit awkward because of the second type parameter. But you could use an alias. Here is how you would use the Lists<T,C> type:

using IntLists=Lists<int,IntCalculator>;
//...

List<int> list=new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
Console.WriteLine("The sum of all elements is {0}",
IntLists.Sum(list));
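The other primitive types follow the same pattern. As a sketch, a calculator for double might look like the following. DoubleCalculator is a hypothetical name chosen to match IntCalculator, and the interface and the Lists class are repeated so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;

interface ICalculator<T>
{
    T Add(T a, T b);
}

// Hypothetical counterpart of IntCalculator for double.
struct DoubleCalculator : ICalculator<double>
{
    public double Add(double a, double b) { return a + b; }
}

class Lists<T, C>
    where T : new()
    where C : ICalculator<T>, new()
{
    private static C calculator = new C();

    public static T Sum(List<T> list)
    {
        T sum = new T();
        for (int i = 0; i < list.Count; i++)
            sum = calculator.Add(sum, list[i]);
        return sum;
    }
}

static class DoubleDemo
{
    static void Main()
    {
        List<double> list = new List<double>();
        list.Add(0.5);
        list.Add(1.5);
        list.Add(2.0);
        // Only the type arguments change compared to the int version.
        Console.WriteLine(Lists<double, DoubleCalculator>.Sum(list)); // prints 4
    }
}
```

The generic Sum method itself is untouched; supporting a new numeric type only requires one more small struct.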

Performance

Since we made sure that our code does not contain any virtual method calls or unnecessary object creation, the generic code should be exactly as fast as non-generic code. To test this, I wrote a small benchmark. The result is encouraging. Even the current very limited JIT compiler manages to inline the Sum method, so the generic version is indeed exactly as fast as the non-generic version on my machine.

Please make sure that you run the benchmark in release mode, since in debug mode many very important optimizations such as method inlining are disabled.
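The benchmark from the download is not reproduced here. As a rough sketch of how such a comparison could be set up, the following program times a hand-written int loop against the generic version. The list size, the repetition count and the names SumBenchmark and SumInts are arbitrary assumptions, and timings obtained this way are only indicative.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

interface ICalculator<T> { T Add(T a, T b); }

struct IntCalculator : ICalculator<int>
{
    public int Add(int a, int b) { return a + b; }
}

class Lists<T, C>
    where T : new()
    where C : ICalculator<T>, new()
{
    private static C calculator = new C();

    public static T Sum(List<T> list)
    {
        T sum = new T();
        for (int i = 0; i < list.Count; i++)
            sum = calculator.Add(sum, list[i]);
        return sum;
    }
}

static class SumBenchmark
{
    // Non-generic baseline.
    internal static int SumInts(List<int> list)
    {
        int sum = 0;
        for (int i = 0; i < list.Count; i++)
            sum += list[i];
        return sum;
    }

    static void Main()
    {
        List<int> list = new List<int>();
        for (int i = 0; i < 1000000; i++)
            list.Add(i % 10);

        // Warm up so that JIT compilation is not part of the measurement.
        SumInts(list);
        Lists<int, IntCalculator>.Sum(list);

        Stopwatch sw = Stopwatch.StartNew();
        int a = 0;
        for (int r = 0; r < 100; r++) a = SumInts(list);
        sw.Stop();
        Console.WriteLine("non-generic: {0} ms (sum {1})", sw.ElapsedMilliseconds, a);

        sw = Stopwatch.StartNew();
        int b = 0;
        for (int r = 0; r < 100; r++) b = Lists<int, IntCalculator>.Sum(list);
        sw.Stop();
        Console.WriteLine("generic:     {0} ms (sum {1})", sw.ElapsedMilliseconds, b);
    }
}
```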

Using operators with type parameters

We can now do calculations on generic types with acceptable performance. But we still cannot use operators with type parameters. In the above example, we had to write sum=calculator.Add(sum,list[i]) instead of simply writing sum+=list[i].

This might not be a big deal if you want to perform a simple addition. But if you have large, complex methods that use operators extensively, converting them to method calls would be a tedious and error-prone process.

Fortunately, it is possible to create a wrapper struct that defines operators. By using implicit conversions to and from the wrapped type, we can use this wrapper struct almost transparently.

This is what a wrapper struct for the above ICalculator interface might look like:

public struct Number<T,C>
    where C:ICalculator<T>,new()
{
    private T value;
    private static C calculator=new C();
    private Number(T value) {
        this.value=value;
    }
    public static implicit operator Number<T,C>(T a)
    {
        return new Number<T,C>(a);
    }
    public static implicit operator T(Number<T,C> a)
    {
        return a.value;
    }
    public static Number<T,C> operator + (Number<T,C> a, Number<T,C> b)
    {
        return calculator.Add(a.value,b.value);
    }
    // ...other operators...
}

Now, instead of using T in your generic class, you use the wrapper struct. Here is what the Sum example looks like with the wrapper struct:

class Lists<T,C>
    where T:new()
    where C:ICalculator<T>,new()
{
    public static T Sum(List<T> list)
    {
        Number<T,C> sum=new T();
        for(int i=0;i<list.Count;i++)
            sum+=list[i];
        return sum;
    }
}
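The benefit becomes more visible with more than one operator. The sketch below computes a dot product. It assumes that ICalculator<T> is extended with a Multiply method and that the wrapper gets a matching * operator; both additions, as well as the names Vectors and Dot, are hypothetical. Note that at least one operand of each operator must be of type Number<T,C>, because the compiler finds no operators on T itself.

```csharp
using System;
using System.Collections.Generic;

// Assumption: ICalculator<T> extended with a hypothetical Multiply method.
interface ICalculator<T>
{
    T Add(T a, T b);
    T Multiply(T a, T b);
}

struct IntCalculator : ICalculator<int>
{
    public int Add(int a, int b) { return a + b; }
    public int Multiply(int a, int b) { return a * b; }
}

struct Number<T, C>
    where C : ICalculator<T>, new()
{
    private T value;
    private static C calculator = new C();

    private Number(T value) { this.value = value; }

    public static implicit operator Number<T, C>(T a) { return new Number<T, C>(a); }
    public static implicit operator T(Number<T, C> a) { return a.value; }

    public static Number<T, C> operator +(Number<T, C> a, Number<T, C> b)
    {
        return calculator.Add(a.value, b.value);
    }

    public static Number<T, C> operator *(Number<T, C> a, Number<T, C> b)
    {
        return calculator.Multiply(a.value, b.value);
    }
}

static class Vectors<T, C>
    where T : new()
    where C : ICalculator<T>, new()
{
    // Dot product written with ordinary operators.
    public static T Dot(List<T> x, List<T> y)
    {
        Number<T, C> sum = new T();
        for (int i = 0; i < x.Count; i++)
        {
            Number<T, C> xi = x[i]; // convert one operand to the wrapper type
            sum += xi * y[i];
        }
        return sum;
    }
}

static class DotDemo
{
    static void Main()
    {
        List<int> x = new List<int>(new int[] { 1, 2, 3 });
        List<int> y = new List<int>(new int[] { 4, 5, 6 });
        Console.WriteLine(Vectors<int, IntCalculator>.Dot(x, y)); // prints 32
    }
}
```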

You might expect that this code runs exactly as fast as the non-operator version. But unfortunately, this is not the case. The current .NET JIT compiler has some very serious limitations. Basically, it does not do any inlining if a method has an explicit struct parameter (Yes, I was shocked too).

Since the operator methods in the wrapper struct have struct parameters, they will not be inlined, so the performance will suffer. But I am sure that Microsoft will fix this important performance issue soon.

Here is a suggestion about this topic that you can vote for. This is the only thing that is missing for lightning-fast and easy to use numerics in C#.

Conclusion

Even though the constraint syntax of the .NET generics system is very limited, it is possible to work around those limits by using a second type parameter. This approach is a bit more work for the writer of the generic class library, but it is quite transparent for the user of the class library and it yields performance that is identical to a non-generic version. Just like with C++ templates, the runtime cost of the abstraction is zero.

There are some limitations, but these should go away very soon as Microsoft and the competing vendors improve their JIT compilers. So, it is possible to write numeric libraries that are as fast as or even faster than compile-time template libraries such as the C++ STL.

About the code

The included project consists of interfaces, implementations and wrappers for all primitive types with a few nice additions. The list benchmark calculates the standard deviation of a large list both with generic and with non-generic methods. Another benchmark compares the performance of the non-generic System.Drawing.Point and System.Drawing.PointF structs with a generic Point<T> version.

The code is under the BSD license, so feel free to use it for your own projects.

References

Anders Hejlsberg proposed using an abstract base class.
A solution using dynamic method generation.
Jeroen Frijters proposed using value types for the calculator.
This was my first posting about the topic. As you might notice, I was quite angry...
Here is where I will publish newer versions of the code.
Two articles by me on osnews about the topic.
Please vote for this suggestion. It is more important than the presidential elections :-)

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.