
Building Linq Expressions Dynamically

28 Nov 2010
An engine in C# that generates Linq Expressions at runtime from simple scripts

Introduction

Well, the general idea is to give you the possibility to write something like this:

 Func<Order,bool> filterOrders = 
     XXX("not(Supplier.City<>\"London\" or Supplier.Status>15)");

You can then use it anywhere a Func or Predicate is required.
This article describes the implementation of a converter from strings such as "not(Supplier.City<>\"London\" or Supplier.Status>15)" to Func<Order, bool>.

The idea of building such an engine came while I was developing a desktop application for managing goods and customers. It had many entities such as customer, good, order, etc., which were presented to the user as tables that could be manipulated. The important part was that the user should be able to set up custom filters, sort items by their own criteria and add special checking conditions, for example: if a good's shelf life is about to end, mark it with a different color.

When creating this application, I decided to write a universal engine that would let me construct predicates and sorting criteria from a string expression. It was .NET 2.0 with no Linq, so I built my own string expression parser and interpreter builder (I used the Interpreter pattern). To access an object's properties and functions, I used reflection, and it all worked just great... with one exception: it was a bit slow because of reflection. Only a bit, because I spent some time optimizing it.

Now, with .NET 4.0 and Linq Expressions, I decided to refactor it to build Linq Expressions. The main change is that I got rid of the Interpreter pattern (a Linq Expression takes its place). More importantly, you can compile the expression, so performance is very good and no reflection is needed when the expression is evaluated.

Background

I use the extension methods of the C# generic collections and Linq Expressions to demonstrate the result.

For the engine itself, I use runtime construction and compilation of Expressions (the Expression.Lambda(...).Compile method). Reflection is used only while the expression is being constructed, to look up an object's properties and methods and to handle type conversions. Of course, I use generics.

All parsing of the expression string and building of the expression is hand made; no special knowledge is required.

By functor, in this article I mean the generic delegates Func<T,TResult>, Func<T1,T2,TResult>, etc. The following example shows a simple use of a functor:

Func<int,bool> func = a => a + 1 == 7;
...
int arg0 = 6;
...
bool b=func(arg0); //b=true!!!

As you can see, to use a functor you have to pass some arguments to it. In .NET 3.5, Func is declared for up to 4 arguments (.NET 4.0 extends this to 16). The functor refers to the passed arguments to calculate its result.
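
As a minimal sketch of what runtime construction means, the same functor can be assembled from Expression factory methods and compiled, without any lambda syntax in the source. The class and method names here (CompileSketch, BuildPlusOneEqualsSeven) are made up for illustration and are not part of the library:

```csharp
using System;
using System.Linq.Expressions;

public static class CompileSketch
{
    // Builds the tree for "a => a + 1 == 7" by hand and compiles it to a delegate.
    public static Func<int, bool> BuildPlusOneEqualsSeven()
    {
        // The functor's single argument.
        ParameterExpression a = Expression.Parameter(typeof(int), "a");

        // a + 1
        BinaryExpression sum = Expression.Add(a, Expression.Constant(1));

        // (a + 1) == 7
        BinaryExpression body = Expression.Equal(sum, Expression.Constant(7));

        // Wrap the body in a lambda and compile it to IL.
        Expression<Func<int, bool>> lambda =
            Expression.Lambda<Func<int, bool>>(body, a);
        return lambda.Compile();
    }
}
```

Calling CompileSketch.BuildPlusOneEqualsSeven() returns a delegate that behaves like the hand-written func above.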

Problem Definition

The problem can now be stated as follows:

Construct functors Func<...,TResult> from a string expression that satisfies a given grammar.

Solution

My solution to the problem described above is a library that creates Linq Expressions and functors from string expressions.

One of the goals was to keep the expression grammar user friendly, so that the user does not have to think about the types of variables, constants, etc. For developers, however, expressions also make it possible to name arguments explicitly, perform type conversions, etc. That said, I do not claim to have built a full-scale compiler, so I omit operators such as ?: and ??. No "if-then-else" or looping is possible. Oh yes, Nullable types are supported!

Approximate grammar (similar to C#, but not 100%):

  1. const = any valid constant expression, strings can be taken in ""
  2. argument = arg0, arg1, ... argN //case insensitive
  3. bin_operator = +|-|*|/|=|<|<=|<>|>|>=|and|or|xor //case insensitive
  4. un_operator = not //case insensitive
  5. type_convertor = bool, byte, char, short, ushort, int, uint, long, ulong, float, double, decimal, string, DateTime //case insensitive
  6. type_conversion = type_convertor(expression)
  7. bin_operation = expression bin_operator expression
  8. un_operation = un_operator expression
  9. property = [arg0].PropName | argument.PropName | (expression).PropName
  10. arg_list = expression, expression, ... , expression
  11. function_call = expression.FuncName(arg_list)
  12. expression = const | argument | property | function_call | bin_operation | un_operation
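
To make the grammar more concrete, here is a hand-written sketch of the tree that the introductory string would roughly correspond to: the property rule maps onto Expression.Property, bin_operation onto factory methods such as Expression.NotEqual, and un_operation onto Expression.Not. The Supplier and Order classes below are minimal stand-ins I assume for illustration, and the engine's real output may differ in detail:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical minimal stand-ins for the demo project's entities.
public class Supplier
{
    public string City { get; set; }
    public int Status { get; set; }
}

public class Order
{
    public Supplier Supplier { get; set; }
}

public static class HandBuiltFilter
{
    // Hand-written equivalent of: not(Supplier.City<>"London" or Supplier.Status>15)
    public static Func<Order, bool> Build()
    {
        // Implicit first argument (arg0 in the grammar).
        ParameterExpression arg0 = Expression.Parameter(typeof(Order), "arg0");

        // property: arg0.Supplier
        MemberExpression supplier = Expression.Property(arg0, "Supplier");

        // bin_operation: Supplier.City <> "London"
        Expression cityNotLondon = Expression.NotEqual(
            Expression.Property(supplier, "City"),
            Expression.Constant("London"));

        // bin_operation: Supplier.Status > 15
        Expression statusAbove15 = Expression.GreaterThan(
            Expression.Property(supplier, "Status"),
            Expression.Constant(15));

        // un_operation: not(... or ...)
        Expression body = Expression.Not(
            Expression.OrElse(cityNotLondon, statusAbove15));

        return Expression.Lambda<Func<Order, bool>>(body, arg0).Compile();
    }
}
```

The resulting delegate accepts only orders whose supplier is in London and has a status of 15 or less.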

Generation is done in several steps:

  1. The string expression is parsed by the ExpressionParser class and a list of Tokens is created
  2. The Tokens are analyzed by the ExpressionBuilder class and the resulting Linq Expression is created
  3. A Linq Lambda function is generated from the expression and compiled to Func<>
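
The first step can be pictured with a very small tokenizer. This is only a sketch under my own assumptions (a regex-based splitter returning plain strings); the real ExpressionParser and its Token type are not shown in this article and may work quite differently:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenizerSketch
{
    // Assumed token shapes, tried in this order: quoted strings, numbers,
    // identifiers/keywords, two-character operators, then single-character
    // operators and punctuation.
    private static readonly Regex TokenPattern = new Regex(
        "\"[^\"]*\"|\\d+(\\.\\d+)?|[A-Za-z_][A-Za-z0-9_]*|<>|<=|>=|[-+*/=<>().,]");

    // Splits an expression string into raw token texts.
    // Whitespace (and any character the pattern does not know) is skipped.
    public static List<string> Tokenize(string expression)
    {
        var tokens = new List<string>();
        foreach (Match match in TokenPattern.Matches(expression))
        {
            tokens.Add(match.Value);
        }
        return tokens;
    }
}
```

A real parser would also classify each token (constant, operator, identifier) and report unknown characters instead of silently skipping them.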

Problems Faced

Besides building the parser, which is fairly simple, and the compiler (builder), which is much more complex, I faced several problems, described below.

Types (solved partially)

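To build a Linq Expression, I need to supply operands of specific types. For example, when you call the Expression.And(Expression arg1,Expression arg2) method as a logical and, arg1 and arg2 must both be of type bool (Expression.And also accepts two operands of the same integral type, in which case it is a bitwise and).

For the Expression.Add(Expression arg1,Expression arg2) method, arg1 and arg2 can be of any type that has the + operator defined. For the built-in numeric types, however, both operands must have the same type, because the factory method does not insert implicit conversions.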
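
A short sketch of this restriction: adding an int constant to a double constant is rejected by Expression.Add until one operand is converted explicitly. The class and method names are illustrative only:

```csharp
using System;
using System.Linq.Expressions;

public static class TypeMismatchSketch
{
    // Returns true if Expression.Add rejects mixed int/double operands.
    public static bool AddFailsWithoutConversion()
    {
        try
        {
            Expression.Add(Expression.Constant(3), Expression.Constant(4.5));
            return false;
        }
        catch (InvalidOperationException)
        {
            // "The binary operator Add is not defined for the types ..."
            return true;
        }
    }

    // Converts the int operand to double first, then adds and evaluates.
    public static double AddWithConversion()
    {
        Expression left = Expression.Convert(Expression.Constant(3), typeof(double));
        Expression sum = Expression.Add(left, Expression.Constant(4.5));
        return Expression.Lambda<Func<double>>(sum).Compile()();
    }
}
```

This is why the builder has to decide on a common operand type before it can call the factory method.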

When an operand is obtained from a property or function and is not of a basic type, I cannot do anything with it, so I just throw an exception (and I think this is the correct handling). However, when there is a constant (especially when there are constants on both sides of an operator, as in 3 >= 4.5), I have to guess which type to select.

I don't like the current solution, but it seems to work for most cases: I simply probe every basic type in turn (bool, int, float, etc.) for both operands. In the case above, the engine decides to use float, because 4.5 is a float while 3 is an int and can be converted to float.
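
The probing idea can be sketched roughly as follows. This is a simplified illustration under my own assumptions (only bool, int and float are probed, with the invariant culture), not the engine's actual code:

```csharp
using System;
using System.Globalization;

public static class ConstantProbe
{
    // Probes a constant's text as bool, then int, then float.
    // Anything else is kept as a string.
    public static object Parse(string text)
    {
        bool b;
        int i;
        float f;
        if (bool.TryParse(text, out b)) return b;
        if (int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out i)) return i;
        if (float.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out f)) return f;
        return text;
    }

    // Picks a type both constants can be converted to:
    // the same type if they already agree, float for an int/float mix,
    // and null when no common type is known.
    public static Type CommonType(object left, object right)
    {
        Type l = left.GetType();
        Type r = right.GetType();
        if (l == r) return l;

        bool leftNumeric = l == typeof(int) || l == typeof(float);
        bool rightNumeric = r == typeof(int) || r == typeof(float);
        return (leftNumeric && rightNumeric) ? typeof(float) : null;
    }
}
```

For 3 >= 4.5, Parse yields an int and a float, and CommonType selects float for both operands.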

Function Calls (solved partially)

When an expression contains a function call, the arguments for the function are passed in (). The engine checks the number of arguments and their types and throws an exception if anything does not match. However, it currently does not take function overloading into account, so constructing a valid expression may fail. This is because the argument types are not analyzed before the MethodInfo for the function is obtained. It is partially related to the problem with types described above.

This can be solved, but it requires more work.
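
One possible direction, sketched here as an assumption rather than as the engine's implementation, is to determine the argument types first and pass them to Type.GetMethod, which then selects the matching overload. A lookup by name alone fails for overloaded methods:

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class OverloadSketch
{
    // Returns true if looking up the method by name alone is ambiguous.
    public static bool NameOnlyLookupIsAmbiguous(Type target, string name)
    {
        try
        {
            target.GetMethod(name);
            return false;
        }
        catch (AmbiguousMatchException)
        {
            return true;
        }
    }

    // Resolves an overload using the already-known argument types.
    public static MethodInfo Resolve(Type target, string name, params Type[] argumentTypes)
    {
        return target.GetMethod(name, argumentTypes);
    }

    // Builds and evaluates the call "hello".IndexOf('l') through Expression.Call.
    public static int CallIndexOf()
    {
        MethodInfo indexOf = Resolve(typeof(string), "IndexOf", typeof(char));
        Expression call = Expression.Call(
            Expression.Constant("hello"), indexOf, Expression.Constant('l'));
        return Expression.Lambda<Func<int>>(call).Compile()();
    }
}
```

Type.GetMethod with argument types only finds overloads the default binder accepts, so arguments that need an explicit conversion would still need extra handling.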

Using the Code

The project targets .NET 4.0 and C#, but it can be compiled for .NET 3.5 (the first version that includes Linq Expressions) with small corrections: the default parameters in some constructors have to be replaced by a second constructor.

For all samples in the demo project, I use several classes related to each other: Project, Detail, Supplier and Order. Basically, the Order entity describes the number of specified Details ordered from a specified Supplier for a specified Project.

  1. Supplier: attributes Name, Status and City
  2. Detail: attributes Name, Color, Weight and City
  3. Project: attributes Name and City
  4. Order: attributes Supplier, Detail, Project and Quantity

Func<Order,bool> londonFilter = 
   ExpressionBuilder.BuildFunctor<Order,bool>(
   "Project.City=\"London\" and Detail.City=Project.City");
Func<Order,string> ordStrConv = 
   ExpressionBuilder.BuildFunctor<Order,string>("Detail.Color.ToString()");
List<string> privelegeSuppliers = 
   new List<string>(orders.Where(londonFilter).Select(ordStrConv).Distinct());

Func<Supplier, Supplier, int> suppsorter = 
   ExpressionBuilder.BuildFunctor<Supplier, Supplier, int>("Status-arg1.Status");
suppliers.Sort(suppsorter.Invoke);

In the RequestEngineTest project, you can find more interesting usage examples.

Points of Interest

I definitely learned much more about Linq and Expressions while creating this library. At the beginning, the final solution was not supposed to be so closely integrated with the built-in Expressions and collection extensions. I was actually surprised to see how well it integrates.

Side Track

There is a class ListSegment<T> in the library, which I think would be worth refactoring and including in a generic collections library. It is an analogue of ArraySegment<T>, but much more useful.

In general, it transparently wraps a List<T> to provide access to only a part of its elements. As it was not my main purpose, I haven't finished it and there is still work to do.

What is Different?

You may ask: why not use a script engine like Jint?

The answer is that each tool is best for its own purpose. My purpose was to have a user-friendly grammar for entering custom criteria for sorting, filtering, etc. I think I got it.

Performance is the second answer. Because the Expression is compiled to IL code and then to native code, its performance is much better than that of a scripting engine.

History & Future

I used the previous version of ExpressionsGenerator (built on the Interpreter pattern and reflection) quite successfully more than a year ago, and I plan to use this new version in the future.

I plan to support some more things like:

  1. Static Method calls
  2. Improved Function call
  3. Maybe add if-then-else somehow
  4. Add building of IComparer<T> from a string expression, so that custom comparers can be used for sorting (this was available in the previous version)
  5. Add Action<T> support
  6. Refactor: separate ExpressionsBuilder from Expression by using the Abstract Factory pattern (this will allow building other constructs based on the same grammar)
  7. Refactor: make it easier to add new operators
  8. Review work with types

Oh yeah, it's a lot of work, but since I plan to use the library, I will probably do it step by step.
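
Until the IComparer<T> item from the list is implemented, a two-argument functor can be wrapped in a small adapter. This is only a sketch; FuncComparer<T> is a hypothetical helper, not a class from the library:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical adapter: exposes a Func<T, T, int> as an IComparer<T>.
public sealed class FuncComparer<T> : IComparer<T>
{
    private readonly Func<T, T, int> compare;

    public FuncComparer(Func<T, T, int> compare)
    {
        if (compare == null) throw new ArgumentNullException("compare");
        this.compare = compare;
    }

    // Delegates the comparison to the wrapped functor.
    public int Compare(T x, T y)
    {
        return compare(x, y);
    }
}
```

A functor such as the suppsorter shown earlier could then be passed as new FuncComparer<Supplier>(suppsorter) to any API that expects an IComparer<Supplier>.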

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
