Introduction
The general idea is to let you write something like this:
Func<Order,bool> filterOrders =
XXX("not(Supplier.City<>\"London\" or Supplier.Status>15)");
You can then use it in any expression that requires a Func or Predicate.
This article is about implementing a converter from strings like "not(Supplier.City<>\"London\" or Supplier.Status>15)" to Func<Order, bool>.
The idea to build such an engine came while I was developing a desktop application for managing goods and customers. There were many entities, such as customer, good, and order, which were presented to the user in table form for manipulation. An important requirement was that the user could set up custom filters, sort items by his own criteria, and add special checking conditions, for example: if a good's shelf life is about to end, mark it with a different color.
When creating this application, I decided to write a universal engine that would let me construct predicates and sorting criteria from a string expression. It was .NET 2.0 with no Linq, so I built my own string expression parser and interpreter builder (I used the Interpreter pattern). To access objects' properties and functions, I used reflection, and it all worked just great... with one exception: it was a bit slow because of reflection. Only a bit, because I had spent some time optimizing it.
Now, with .NET 4.0 and Linq Expressions, I decided to refactor it to build Linq Expressions. Basically, what changed is that I got rid of the Interpreter pattern (a Linq Expression takes its place). More importantly, you can compile the expression, so performance is very good and no reflection is needed when it runs.
Background
I use the extension methods of C# generic collections and Linq Expressions to demonstrate the result.
For the engine itself, I use runtime Expression construction and compilation (Expression.Lambda followed by the Compile method). I use reflection during expression construction to look up objects' properties and methods and to perform type conversions. Of course, I use generics.
All parsing of the expression string and building of the interpreter is handmade; no special knowledge is required.
By functor, in this article I mean the generic delegates Func<T,TResult>, Func<T1,T2,TResult>, etc. The following example shows a simple use of a functor:
Func<int,bool> func = a => a + 1 == 7;
...
int arg0 = 6;
...
bool b=func(arg0);
As you can see, to use a functor you have to pass some arguments to it. In .NET 4.0, Func is declared for up to 16 arguments (up to 4 in .NET 3.5). The functor refers to the passed arguments to calculate the result.
Problem Definition
The problem can be stated as follows:
Construct functors Func<...,TResult> from a string expression that satisfies some grammar.
Solution
My solution to the problem described above is a library that creates Linq Expressions and functors from a string expression.
One of the goals was to keep the expression grammar user friendly, so that the user does not have to think about the types of variables, constants, etc. For developers, however, expressions make it possible to indicate arguments, perform type conversions, etc. That said, I do not claim to have built a full-scale compiler, so I omit operators such as ?: and ??. No "if-then-else" or looping is possible. Oh yes, Nullable types are supported!
Approximate grammar (similar to C#, but not 100%):
const = any valid constant expression; strings are enclosed in ""
argument = arg0, arg1, ... argN // case insensitive
bin_operator = +|-|*|/|=|<|<=|<>|>|>=|and|or|xor // case insensitive
un_operator = not // case insensitive
type_convertor = bool, byte, char, short, ushort, int, uint, long, ulong, float, double, decimal, string, DateTime // case insensitive
type_conversion = type_convertor(expression)
bin_operation = expression bin_operator expression
un_operation = un_operator expression
property = [arg0].PropName | argument.PropName | (expression).PropName
arg_list = expression, expression, ... , expression
function_call = expression.FuncName(arg_list)
expression = const | argument | property | function_call | bin_operation | un_operation
Generation is done in several steps:
- The string expression is parsed by the ExpressionParser class and a list of Tokens is created
- The Tokens are analyzed by the ExpressionBuilder class and the resulting Linq Expression is created
- A Linq Lambda function is generated from the expression and compiled to Func<>
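The last two steps can be illustrated by building, by hand, the kind of tree a builder might emit for the expression from the introduction. This is only a sketch under stated assumptions: the Supplier and Order classes are cut-down hypothetical stand-ins for the demo entities, PipelineSketch is a made-up name, and the library's actual ExpressionBuilder may construct its tree differently.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical, cut-down stand-ins for the demo entities.
public class Supplier
{
    public string City { get; set; }
    public int Status { get; set; }
}

public class Order
{
    public Supplier Supplier { get; set; }
}

public static class PipelineSketch
{
    // Hand-builds a tree for:
    //   not(Supplier.City<>"London" or Supplier.Status>15)
    // and compiles it to a delegate.
    public static Func<Order, bool> BuildFilter()
    {
        // arg0 is the implicit argument that property names are resolved against.
        ParameterExpression arg0 = Expression.Parameter(typeof(Order), "arg0");
        MemberExpression supplier = Expression.Property(arg0, "Supplier");

        Expression cityNotLondon = Expression.NotEqual(
            Expression.Property(supplier, "City"),
            Expression.Constant("London"));

        Expression statusAbove15 = Expression.GreaterThan(
            Expression.Property(supplier, "Status"),
            Expression.Constant(15));

        Expression body = Expression.Not(
            Expression.OrElse(cityNotLondon, statusAbove15));

        // Wrap the body in a lambda and compile it to IL.
        return Expression.Lambda<Func<Order, bool>>(body, arg0).Compile();
    }
}
```

The returned delegate can be passed directly to extension methods such as Where, exactly like a lambda written in source code.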
Problems Faced
Along with building the parser, which is fairly simple, and the compiler (builder), which is much more complex, I faced several problems, described below.
Types (solved partially)
To build a Linq Expression, I need to supply arguments of specific types. For example, when you call the Expression.And(Expression arg1, Expression arg2) method, arg1 and arg2 should both be of type bool (Expression.And is the bitwise operator, so it also accepts two integral operands of the same type).
For the Expression.Add(Expression arg1, Expression arg2) method, arg1 and arg2 can be of any type, but it must have the + operator defined.
When an operand is obtained from a property or function and is not of a basic type, I cannot do anything with it, so I just throw an exception (and I think this is the correct handling). However, when there is a constant (especially when there are constants on both sides of the operator, as in 3 >= 4.5), I have to guess which type to select.
I don't like the current solution, but it seems to work for most cases: I just probe every basic type, starting from bool, int, float, etc., for both operands. In the case mentioned, the engine decides to use float, as 4.5 is a float while 3 is an int and can be converted to float.
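The probing idea can be sketched roughly as follows. This is an illustration, not the library's code: the ConstantTypingSketch class, its method names, and the shortened candidate list (bool, int, float) are assumptions made for the example, and invariant-culture parsing is my choice.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

public static class ConstantTypingSketch
{
    // Candidate types are probed in this order; the first one that can
    // represent BOTH constants wins. The list is shortened for illustration.
    private static readonly Type[] Candidates =
        { typeof(bool), typeof(int), typeof(float) };

    private static bool TryParse(string text, Type type, out object value)
    {
        value = null;
        if (type == typeof(bool))
        {
            bool b;
            if (!bool.TryParse(text, out b)) return false;
            value = b;
            return true;
        }
        if (type == typeof(int))
        {
            int i;
            if (!int.TryParse(text, NumberStyles.Integer,
                              CultureInfo.InvariantCulture, out i)) return false;
            value = i;
            return true;
        }
        if (type == typeof(float))
        {
            float f;
            if (!float.TryParse(text, NumberStyles.Float,
                                CultureInfo.InvariantCulture, out f)) return false;
            value = f;
            return true;
        }
        return false;
    }

    // Returns the first candidate type that both constant texts parse into.
    public static Type CommonType(string left, string right)
    {
        foreach (Type t in Candidates)
        {
            object l, r;
            if (TryParse(left, t, out l) && TryParse(right, t, out r)) return t;
        }
        throw new InvalidOperationException("No common type for the constants.");
    }

    // Builds and evaluates "left >= right" using the probed common type.
    public static bool EvaluateGreaterOrEqual(string left, string right)
    {
        Type t = CommonType(left, right);
        object l, r;
        TryParse(left, t, out l);
        TryParse(right, t, out r);
        Expression body = Expression.GreaterThanOrEqual(
            Expression.Constant(l, t), Expression.Constant(r, t));
        return Expression.Lambda<Func<bool>>(body).Compile()();
    }
}
```

For the constants 3 and 4.5, bool and int are rejected because 4.5 parses as neither, so float is selected for both operands. The weakness is visible here as well: the chosen type depends only on the probing order, not on how the value is used later.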
Function Calls (solved partially)
When an expression contains a function call, the arguments for the function are passed in (). The engine checks the number of arguments and the argument types and throws an exception if anything does not match. However, it currently does not take function overloading into account, and as a result the construction of a valid expression may fail. This is because argument types are not analyzed before the MethodInfo for the function is obtained. It is partially related to the problem with types described above.
It can be solved eventually, but it requires more work.
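One possible direction, shown here only as a sketch and not as what the library does today, is to build the argument expressions first and then ask reflection for the overload that matches their types, using Type.GetMethod(string, Type[]). The OverloadSketch class and its method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class OverloadSketch
{
    // Builds target.methodName(args...), choosing the overload from the
    // static types of the argument expressions that were already built.
    public static MethodCallExpression Call(
        Expression target, string methodName, params Expression[] args)
    {
        Type[] argTypes = args.Select(a => a.Type).ToArray();
        MethodInfo method = target.Type.GetMethod(methodName, argTypes);
        if (method == null)
            throw new MissingMethodException(target.Type.Name, methodName);
        return Expression.Call(target, method, args);
    }

    // Example: arg0.IndexOf(arg1) with both arguments of type string.
    public static Func<string, string, int> BuildIndexOf()
    {
        ParameterExpression arg0 = Expression.Parameter(typeof(string), "arg0");
        ParameterExpression arg1 = Expression.Parameter(typeof(string), "arg1");
        MethodCallExpression body = Call(arg0, "IndexOf", arg1);
        return Expression.Lambda<Func<string, string, int>>(body, arg0, arg1)
                         .Compile();
    }
}
```

This sketch does not insert conversion nodes, so an argument whose type only matches an overload through an implicit conversion would still need an explicit Expression.Convert, which leads back to the type problem described above.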
Using the Code
The project targets .NET 4.0 and C#, but it can be compiled for .NET 3.5 with small corrections (you will need to replace default parameters in some constructors with a second constructor).
For all samples in the demo project, I use several classes related to each other: Project, Detail, Supplier and Order. Basically, the Order entity describes the quantity of a specified Detail ordered from a specified Supplier for a specified Project.
- Supplier: attributes Name, Status and City
- Detail: attributes Name, Weight and City
- Project: attributes Name and City
- Order: attributes Supplier, Detail, Project and Quantity
Func<Order,bool> londonFilter =
ExpressionBuilder.BuildFunctor<Order,bool>(
"Project.City=\"London\" and Detail.City=Project.City");
Func<Order,string> ordStrConv =
ExpressionBuilder.BuildFunctor<Order,string>("Detail.Color.ToString()");
List<string> privelegeSuppliers =
new List<string>(orders.Where(londonFilter).Select(ordStrConv).Distinct());
Func<Supplier, Supplier, int> suppsorter =
ExpressionBuilder.BuildFunctor<Supplier, Supplier, int>("Status-arg1.Status");
suppliers.Sort(suppsorter.Invoke);
In the RequestEngineTest project, you can find more interesting usage examples.
Points of Interest
I definitely learned much more about Linq and Expressions while creating this library. Initially, the final solution was not supposed to be so tightly integrated with the built-in Expressions and collection extensions. I was actually surprised to see how well it integrates.
Side Track
There is a class ListSegment<T> in the library which I think could be worth refactoring and including in a generic collections library. It is an analogue of ArraySegment<T>, but much more useful.
In general, it transparently wraps a List<T> to provide access to only part of its elements. As it was not my main purpose, I haven't finished it and there is still work to do.
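To give an idea of what such a wrapper looks like, here is a minimal sketch. It is not the ListSegment<T> class from the library: the name ListSegmentSketch<T> and its members are assumptions for illustration, and only indexing, Count, and enumeration are shown.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// A window onto part of a List<T>. Elements are not copied, so writes
// through the segment are visible in the underlying list.
public sealed class ListSegmentSketch<T> : IEnumerable<T>
{
    private readonly List<T> _list;
    private readonly int _offset;
    private readonly int _count;

    public ListSegmentSketch(List<T> list, int offset, int count)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (offset < 0 || count < 0 || offset > list.Count - count)
            throw new ArgumentOutOfRangeException("offset");
        _list = list;
        _offset = offset;
        _count = count;
    }

    public int Count { get { return _count; } }

    public T this[int index]
    {
        get { CheckIndex(index); return _list[_offset + index]; }
        set { CheckIndex(index); _list[_offset + index] = value; }
    }

    private void CheckIndex(int index)
    {
        if (index < 0 || index >= _count)
            throw new ArgumentOutOfRangeException("index");
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; i < _count; i++)
            yield return _list[_offset + i];
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

Because the segment implements IEnumerable<T>, it works with the same extension methods (Where, Select, etc.) as the full list.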
What is Different?
You may ask: why not use a script engine like Jint?
The answer is that each tool is best for its own purpose. My purpose was to have a user-friendly grammar for entering custom criteria for sorting, filtering, etc. I think I achieved that.
Performance is the second answer. Because the Expression is ultimately compiled to IL and then to native code, its performance is much better than that of a scripting engine.
History & Future
I used the previous version of ExpressionsGenerator (built on the Interpreter pattern and reflection) quite successfully more than a year ago and plan to use this new version in the future.
I plan to support some more things, like:
- Static method calls
- Improved function calls
- Maybe add if-then-else somehow
- Add building of IComparer<T> from a string expression, so that custom comparers can be used for sorting (this was in the previous version)
- Add Action<T> support
- Refactor: separate ExpressionsBuilder from Expression by using the AbstractFactory pattern (this will allow building other constructs based on the same grammar)
- Refactor: make it easier to add new operators
- Review work with types
Oh yeah, it's a lot of work, but since I plan to use it, maybe I will do it step by step.