Introduction
The current implementation of .NET generics is used mainly to make type-safe collections faster and more easy to use. Doing calculations on generic types is not as straightforward.
The problem
An example for doing calculations on generic types would be a generic method to calculate the sum of all elements in a List<T>
. Of course, summing up all elements of a list only makes sense if T
is a type such as int
, double
, decimal
that defines an addition operation.
Somebody coming from a C++ background might implement such a method like this:
public class Lists {
...
public static T Sum(List<T> list)
{
T sum=0;
for(int i=0;i<list.Count;i++)
sum+=list[i];
return sum;
}
...
}
This is not possible in C# because unconstrained type parameters are assumed to be of type System.Object
, which does not define a +
operation.
To constrain type parameters in C#/.NET, you specify interfaces that the type has to implement. The problem is that interfaces may not contain any static methods, and operator methods are static methods.
So with the current constraint system, it is not possible to define operator constraints.
A clean way to enable numerical computations would be to let the basic data types like int
, float
, double
, decimal
etc. implement an interface for arithmetic operations. Then this interface could be used to constrain the type parameters. This would work similar to the IComparable<T>
interface that all basic data types implement.
I tried to convince the people at Microsoft to implement such an interface, but apparently they won't be able to do it in time for Whidbey.
We are on our own
Many people have been thinking about this problem, among them Eric Gunnerson and even Anders Hejlsberg.
The solution proposed by Anders Hejlsberg was to have an abstract class Calculator<T>
that has to be specialized for each primitive type. The generic type would then use an instance of the appropriate calculator to do the calculations.
Here is the code (copied from Eric Gunnerson's Blog):
First define the abstract base class:
public abstract class Calculator<T>
{
public abstract T Add(T a, T b);
}
Then specialize for the types you want to perform calculations on:
namespace Int32
{
public class Calculator: Calculator<int>
{
public override int Add(int a, int b)
{
return a + b;
}
}
}
Then use an appropriate Calculator<T>
to do the calculations. Here is an example that calculates the sum of all elements in a List<T>
.
class AlgorithmLibrary<T> where T: new()
{
Calculator<T> calculator;
public AlgorithmLibrary(Calculator<T> calculator)
{
this.calculator = calculator;
}
public T Sum(List<T> items)
{
T sum = new T();
for (int i = 0; i < items.Count; i++)
{
sum = calculator.Add(sum, items[i]);
}
return sum;
}
}
You would use it like this:
AlgorithmLibrary library = new AlgorithmLibrary<int>(new Int32.Calculator());
There are many other creative solutions, but all of them have the drawback that they involve some kind of dynamic method invocation (virtual methods, interfaces or delegates). So while they make it possible to perform calculations with generic type parameters, the performance is unacceptably low for numerical applications.
The Solution
The first person to come up with a solution that does not involve any virtual method invocation was Jeroen Frijters. The solution uses the fact that constraining type parameters using interfaces is not the same as casting to interfaces. Calling a method using an interface has the overhead of dynamic method dispatch, but calling a method on a type parameter that is constrained by an interface has no such overhead.
Here is an example. It is a generic method that sorts two numbers. The type T
has to implement IComparable<T>
so that we can be sure that it has the CompareTo
method. But nevertheless, calling the CompareTo
method does not have the overhead normally associated with interfaces.
public class Sorter
{
private static void Swap<T>(ref T a, ref T b)
{
T t=a;a=b;b=t;
}
public static void Sort<T>(ref T a,ref T b)
where T:IComparable<T>
{
if(a.CompareTo(b)>0)
Swap(ref a,ref b);
}
}
The approach suggested by Jeroen Frijters uses a second type parameter. To avoid unnecessary object creation and virtual method dispatch, it is best to use a value type for the type containing the operations. Here is a small example for this approach:
interface ICalculator<T>
{
T Sum(T a,T b);
}
struct IntCalculator : ICalculator<int>
{
public int Add(int a,int b) { return a+b; }
}
class Lists<T,C>
where T:new()
where C:ICalculator<T>,new();{
private static C calculator=new C();
public static T Sum(List<T> list)
{
T sum=new T();
for(int i=0;i<list.Count;i++)
sum=calculator.Add(sum,list[i]);
return sum;
}
}
The use of this class is a bit awkward because of the second type parameter. But you could use an alias. Here is how you would use the Lists<T,C>
type:
using IntLists=Lists<int,IntCalculator>;
List<int> list=new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
Console.WriteLine("The sum of all elements is {0}",
IntLists.Sum(list));
Performance
Since we made sure that our code does not contain any virtual method calls or unnecessary object creation, the generic code should be exactly as fast as non-generic code. To test this, I wrote a small benchmark. The result is encouraging. Even the current very limited JIT compiler manages to inline the Sum
method, so the generic version is indeed exactly as fast as the non-generic version on my machine.
Please make sure that you run the benchmark in release mode, since in debug mode many very important optimizations such as method inlining are disabled.
Using operators with type parameters
We now can do calculations on generic types with acceptable performance. But we still can not use operators with type parameters. In the above example, we had to write sum=calculator.Add(sum,list[i])
instead of simply writing sum+=list[i]
.
This might not be a big deal if you want to perform a simple addition. But if you have large, complex methods that excessively use operators, it would be a tedious and error-prone process to convert them to use method calls instead.
Fortunately, it is possible to create a wrapper struct
that defines operators. By using implicit conversions to and from the wrapped type, we can use this wrapper struct
almost transparently.
This is how a wrapper struct
for the above ICalculator
interface might look like:
public struct Number<T,C>
where C:ICalculator<T>,new()
{
private T value;
private static C calculator=new C();
private Number(T value) {
this.value=value;
}
public static implicit operator Number<T,C>(T a)
{
return new Number<T,C>(a);
}
public static implicit operator T(Number<T,C> a)
{
return a.value;
}
public static Number<T,C> operator + (Number<T,C> a, Number<T,C> b)
{
return calculator.Add(a.value,b.value);
}
...other operators...
}
Now, instead of using T
in your generic class, you would just use the wrapper struct
instead. Here is how the Sum
example would look like using the wrapper struct
:
class Lists<T,C>
where T:new()
where C:ICalculator<T>,new()
{
public static T Sum(List<T> list)
{
Number<T,C> sum=new T();
for(int i=0;i<list.Count;i++)
sum+=list[i];
return sum;
}
}
You might expect that this code runs exactly as fast as the non-operator version. But unfortunately, this is not the case. The current .NET JIT compiler has some very serious limitations. Basically, it does not do any inlining if a method has an explicit struct parameter (Yes, I was shocked too).
Since the operator methods in the wrapper struct
have struct
parameters, they will not be inlined, so the performance will suffer. But I am sure that Microsoft will fix this important performance issue soon.
Here is a suggestion about this topic that you can vote for. This is the only thing that is missing for lightning-fast and easy to use numerics in C#.
Conclusion
Even though the constraint syntax of .NET generics system is very limited, it is possible to work around those limits by using a second type parameter. This approach is a bit more work for the writer of the generic class library, but it is quite transparent for the user of the class library and it yields performance that is identical to a non-generic version. Just like with C++ templates, the runtime cost of the abstraction is zero.
There are some limitations, but these should go away very soon as Microsoft and the competing vendors improve the JIT compiler performance. So, it is possible to write numeric libraries that are as fast or even faster than compile-time template libraries such as the C++ STL.
About the code
The included project consists of interfaces, implementations and wrappers for all primitive types with a few nice additions. The list benchmark calculates the standard deviation of a large list both with generic and with non-generic methods. Another benchmark compares the performance of the non-generic System.Drawing.Point
and System.Drawing.PointF
struct
s with a generic Point<T>
version.
The code is under the BSD license, so feel free to use it for your own projects.
References
- Anders Hejlsberg proposed using an abstract base class.
- A solution using dynamic method generation.
- Jeroen Frijters proposed using value types for the calculator.
- This was my first posting about the topic. As you might notice, I was quite angry...
- Here is where I will publish newer versions of the code.
- Two articles by me on osnews about the topic.
- Please vote for this suggestion. It is more important than the presidential elections :-)