Modification of .NET Expressions

Ivan Yakimov

4 Dec 2014

In this article, I'll show you how to modify expressions which are used to create Entity Framework queries.

Download source code - 16.5 KB

Introduction

I think many of us write simple Web API REST services in the following way: we use the parameters of a service method to construct an Entity Framework LINQ query and return the results. But sometimes we do not return EF entity objects directly and instead repack them into other objects:

public PersonInfo[] GetPersons()
{
    return _dbContext.Persons.Select(p => new PersonInfo { Id = p.PersonId, Name = p.Name }).ToArray();
}

The reasons for repacking can be different:

Maybe you want to send the consumer only some of the properties of the EF entity objects.
Maybe you construct objects for the consumer from several EF entity objects.
Maybe you want to arrange some properties of the EF entity objects into groups represented by other classes.
Etc.

Whatever the reason, sooner or later you face the problem of filtering. Your method can return hundreds or thousands of objects, but consumers may want only those that satisfy some criteria. How do we implement this filtering?

Certainly, there may be simple cases. If consumers only want to filter persons by name, you may write something like this:

public PersonInfo[] GetPersons(string startOfName)
{
    return _dbContext.Persons.Where(p => p.Name.StartsWith(startOfName))
        .Select(p => new PersonInfo { Id = p.PersonId, Name = p.Name }).ToArray();
}

But in general, the consumer may want to filter data from your service by any property of the PersonInfo class. Moreover, the filter conditions can vary a lot. Let's say the PersonInfo class has an Age property. Here are some possible filters:

personInfo.Age == 30
personInfo.Age < 50
personInfo.Age >= 20 && personInfo.Age < 60

How can we support all these variants? One way is to create a special filter class:

public class FilterCondition
{
    public string PropertyName { get; set; }
    public OperationType Operation { get; set; }
    public object Value { get; set; }
}

public enum OperationType
{
    Equals,
    Less,
    Greater,
}

In this case, the consumer can call our service like this:

service.GetPersons(new []{ 
        new FilterCondition { PropertyName = "Age", Operation = OperationType.Less, Value = 50 } 
    });

And the service method will look like this:

public PersonInfo[] GetPersons(FilterCondition[] conditions)
{
    // Declared as IQueryable<Person> so that the result of Where can be assigned back.
    IQueryable<Person> persons = _dbContext.Persons;

    foreach(var condition in conditions)
    {
         var value = condition.Value;
         if(condition.PropertyName == "Age")
         {
             switch(condition.Operation)
             {
                 case OperationType.Less:
                     // Unbox once, outside the lambda, so the query captures a plain int.
                     var age = (int)value;
                     persons = persons.Where(p => p.Age < age);
                     break;
                 ....
             }
         }
         ....
    }
   
    return persons.Select(p => new PersonInfo { Id = p.PersonId, Name = p.Name }).ToArray();
}

In my opinion, this implementation has several serious drawbacks:

First of all, the implementation of the GetPersons service method is too complex. We have to write separate cases for every property we want to filter on. Sometimes we can improve the situation using Reflection, but there are complex cases. For example, the Location.Town property of the PersonInfo class can be filled from the Address.City property of the Person EF entity class.
If we want to add filtering on a new property, we have to change the GetPersons method.
The list of operations in the OperationType enum can never be complete. There will always be cases when a consumer wants to apply another type of filter, and then we have to change the GetPersons method to support the new operation.
It is not very convenient for the consumer to create a list of FilterCondition objects.
It is not easy to rename properties of the Person and PersonInfo classes. They are tied to the values of the PropertyName property of the FilterCondition class, which is just a string. After renaming, one has to check manually that all PropertyName values are still correct.
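
The Reflection-based improvement mentioned in the first point could look roughly like the sketch below. It is not taken from the article's source code: FilterBuilder and BuildPredicate are hypothetical names, and the sketch assumes that the filtered property exists under the same name on the entity class.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public enum OperationType { Equals, Less, Greater }

public class FilterCondition
{
    public string PropertyName { get; set; }
    public OperationType Operation { get; set; }
    public object Value { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hypothetical helper, not part of the article's source code.
public static class FilterBuilder
{
    public static Expression<Func<T, bool>> BuildPredicate<T>(FilterCondition condition)
    {
        var parameter = Expression.Parameter(typeof(T), "p");
        // Throws ArgumentException if T has no property with this name.
        var property = Expression.Property(parameter, condition.PropertyName);
        // Convert the boxed value to the property type so that both operands match.
        var value = Expression.Constant(
            Convert.ChangeType(condition.Value, property.Type), property.Type);

        Expression body;
        switch (condition.Operation)
        {
            case OperationType.Equals: body = Expression.Equal(property, value); break;
            case OperationType.Less: body = Expression.LessThan(property, value); break;
            case OperationType.Greater: body = Expression.GreaterThan(property, value); break;
            default: throw new NotSupportedException(condition.Operation.ToString());
        }
        return Expression.Lambda<Func<T, bool>>(body, parameter);
    }
}

public static class Program
{
    public static void Main()
    {
        var people = new[]
        {
            new Person { Name = "Ann", Age = 25 },
            new Person { Name = "Bob", Age = 61 }
        };
        var predicate = FilterBuilder.BuildPredicate<Person>(
            new FilterCondition { PropertyName = "Age", Operation = OperationType.Less, Value = 50 });

        // AsQueryable lets us try the expression without a database.
        foreach (var person in people.AsQueryable().Where(predicate))
            Console.WriteLine(person.Name);
    }
}
```

This removes the per-property switch, but the other drawbacks (string property names, a closed set of operations, no mapping such as Location.Town to Address.City) remain.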

Solution

So what is the desired solution? What if the consumer could write arbitrary filter expressions like this:

var persons = service.GetPersons(p => p.Name.StartsWith("D") && p.Age < 30);

I think this is a very convenient approach. The consumer can write any filter using any functions, the compiler checks the names of properties and their types, and Visual Studio renames properties automatically when needed.

But is this solution achievable? Well, partly.

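As you probably know, LINQ methods like Where, from which Entity Framework queries are built, accept Expression objects as parameters. These Expression objects represent an object model of our code (an expression tree). A lambda with an expression body can be converted into an Expression object simply by assigning it to a variable of type Expression<TDelegate>: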

Expression<Func<PersonInfo, bool>> expr = p => p.Name.StartsWith("D") && p.Age < 30;
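
To make the "object model" idea more concrete, here is a small sketch that inspects the tree built for this filter. The PersonInfo class below is a cut-down stand-in with only the two properties used in the filter, and BuildFilter is a name introduced just for this example.

```csharp
using System;
using System.Linq.Expressions;

public class PersonInfo
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class Program
{
    public static Expression<Func<PersonInfo, bool>> BuildFilter()
    {
        return p => p.Name.StartsWith("D") && p.Age < 30;
    }

    public static void Main()
    {
        var expr = BuildFilter();
        var body = (BinaryExpression)expr.Body;

        Console.WriteLine(body.NodeType);        // AndAlso
        Console.WriteLine(body.Left.NodeType);   // Call
        Console.WriteLine(body.Right.NodeType);  // LessThan
        Console.WriteLine(expr.Parameters[0].Type.Name); // PersonInfo

        // The tree can also be compiled back into an ordinary delegate.
        Func<PersonInfo, bool> filter = expr.Compile();
        Console.WriteLine(filter(new PersonInfo { Name = "Dan", Age = 20 })); // True
    }
}
```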

We can define our GetPersons method like this:

public PersonInfo[] GetPersons(Expression<Func<PersonInfo, bool>> filter)

and use this filter expression inside.

But there are 2 major obstacles.

Obstacle 1: Transmission

If we have a Web API REST service, we must somehow transfer the Expression object from the consumer to the service. Unfortunately, the Expression class is not serializable, but there are third-party libraries that can do it (expressiontree.codeplex.com). Also, to be able to deserialize this object on the service side, all constructs inside the expression must be known to the service. This means that consumers can't write their filters like this:

Expression<Func<PersonInfo, bool>> expr = p => p.Age < GetDesiredAge();

because the service side does not know the GetDesiredAge function. Instead, something like this should be used:

var age = GetDesiredAge();
Expression<Func<PersonInfo, bool>> expr = p => p.Age < age;

There are similar limitations for using classes in filter expressions.
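
The difference between the two forms is visible in the expression tree itself. The sketch below uses a one-property stand-in for PersonInfo and a hypothetical RightOperandKind helper. How a particular serialization library treats captured variables is not covered here; the last lines only show that the captured value can be computed on the consumer side.

```csharp
using System;
using System.Linq.Expressions;

public class PersonInfo { public int Age { get; set; } }

public static class Program
{
    public static int GetDesiredAge() { return 30; }

    // Returns the node type of the right-hand operand of a comparison filter.
    public static ExpressionType RightOperandKind(Expression<Func<PersonInfo, bool>> filter)
    {
        return ((BinaryExpression)filter.Body).Right.NodeType;
    }

    public static void Main()
    {
        // The call itself is stored in the tree; the receiver must know GetDesiredAge.
        Console.WriteLine(RightOperandKind(p => p.Age < GetDesiredAge())); // Call

        // Only a reference to the captured local variable is stored in the tree.
        var age = GetDesiredAge();
        Expression<Func<PersonInfo, bool>> byValue = p => p.Age < age;
        Console.WriteLine(RightOperandKind(byValue)); // MemberAccess

        // The captured value can be computed locally before the filter is sent.
        var right = ((BinaryExpression)byValue.Body).Right;
        object value = Expression.Lambda(right).Compile().DynamicInvoke();
        Console.WriteLine(value); // 30
    }
}
```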

Obstacle 2: Rewriting

There is one more obstacle, which in my opinion is more important than the previous one. As you can see, we have an Expression<Func<PersonInfo, bool>> object, whereas Entity Framework needs an Expression<Func<Person, bool>> object. This is where expression rewriting comes into play. You can download the source code of the expression rewriter from this article. Below, we'll discuss how it works.

Implementation of Rewriting

Roughly speaking, we need a function that takes an Expression<Func<PersonInfo, bool>> and returns an Expression<Func<Person, bool>>. But to do this, we need some additional information. For example, we need to know that the Location.Town property of the PersonInfo class should be replaced with the Address.City property of the Person class. This information is supplied when the rewriter is configured:

var rewriter = new ExpressionRewriter();
rewriter.ChangeArgumentType<PersonInfo>().To<Person>();
rewriter.ChangeProperty<PersonInfo>(pi => pi.Status).To<Person>(p => p.FamilyStatus);
rewriter.ChangeProperty<PersonInfo>(pi => pi.Country).To<Person>(p => p.Address.Country);
rewriter.ChangeProperty<PersonInfo>(pi => pi.Location.Town).To<Person>(p => p.Address.City);

As you can see, this information is provided using the ChangeProperty method. The ChangeArgumentType method says that the PersonInfo class in the arguments of the function represented by our expression should be replaced by the Person class.

OK, now we have all the required information and can start rewriting the expression. In fact, the term 'rewriting' is not strictly correct. All descendants of the Expression class are immutable and cannot be modified, so we need to create a new expression based on the existing one. Fortunately, Microsoft provides the ExpressionVisitor class, which makes this work easier. You can inherit your expression changer from this class:

class RewritingVisitor : ExpressionVisitor

You pass your initial expression to the Visit method, and it returns the changed expression. If you don't override any methods of ExpressionVisitor, the returned expression is the same as the input. But we will override some.
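
Both behaviours can be seen in a small sketch. PassThroughVisitor and IntConstantReplacer are illustrative classes written for this example; they are not part of the article's rewriter.

```csharp
using System;
using System.Linq.Expressions;

// A visitor that overrides nothing: every node is returned unchanged.
public class PassThroughVisitor : ExpressionVisitor { }

// A visitor that replaces every int constant with a fixed value (illustrative only).
public class IntConstantReplacer : ExpressionVisitor
{
    private readonly int _newValue;

    public IntConstantReplacer(int newValue) { _newValue = newValue; }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        return node.Type == typeof(int)
            ? Expression.Constant(_newValue)
            : base.VisitConstant(node);
    }
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<int, bool>> expr = x => x < 30;

        var same = new PassThroughVisitor().Visit(expr);
        Console.WriteLine(ReferenceEquals(same, expr)); // True: nothing was rebuilt

        var changed = (Expression<Func<int, bool>>)new IntConstantReplacer(10).Visit(expr);
        Console.WriteLine(changed.Compile()(20)); // False: the filter is now x < 10
    }
}
```

Note that the original tree is left untouched: the second visitor returns a new lambda, and only the nodes on the path to the replaced constant are rebuilt.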

Changing Type of Filtering Function

As was said above, on input we have Expression<Func<PersonInfo, bool>>, but we need Expression<Func<Person, bool>>. So Func<PersonInfo, bool> must be changed to Func<Person, bool>. This is done in an override of the VisitLambda method:

protected override Expression VisitLambda<T>(Expression<T> node)
{
    var body = Visit(node.Body);
    var parameters = VisitAndConvert(node.Parameters, "VisitLambda");

    if (body == node.Body && parameters == node.Parameters)
    { return node; }

    Type delegateType;

    var funcGenericTypes = new List<Type>(parameters.Select(p => p.Type));

    funcGenericTypes.Add(body.Type);

    var funcType = typeof(Func<>).Assembly.GetTypes()
            .Where(t => t.Name.StartsWith("Func`"))
            .Where(t => t.IsGenericType)
            .FirstOrDefault(t => t.GetGenericArguments().Length == funcGenericTypes.Count);

    if (funcType == null)
    {
        throw new InvalidOperationException("Can't find corresponding Func<> type");
    }

    delegateType = funcType.MakeGenericType(funcGenericTypes.ToArray());
    return Expression.Lambda(delegateType, body, parameters);
}

First of all, we rewrite the body and the parameters of the lambda. If they were not changed during rewriting, we return the initial expression. If they were changed, we create a new delegate type (delegateType) for the rewritten expression and return a new lambda expression of that type.

Here we can potentially have a problem. We assume that the input expression has type Expression<Func<...>>, but in general it can be Expression<AnyDelegate>. It is hard to say what should be done with an arbitrary AnyDelegate. So for simplicity, and because we only need filter expressions for Entity Framework, we'll stick to our assumption.
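
As a side note, the framework has a built-in way to obtain the Func<...> type, Expression.GetFuncType, which could be used instead of searching the assembly by type name. The sketch below shows that alternative; BuildLambda is a hypothetical helper, not the code from the download.

```csharp
using System;
using System.Linq.Expressions;

public static class Program
{
    public static LambdaExpression BuildLambda(Expression body, params ParameterExpression[] parameters)
    {
        // Type arguments of Func<...>: the parameter types followed by the return type.
        var typeArgs = new Type[parameters.Length + 1];
        for (int i = 0; i < parameters.Length; i++)
        {
            typeArgs[i] = parameters[i].Type;
        }
        typeArgs[parameters.Length] = body.Type;

        Type delegateType = Expression.GetFuncType(typeArgs);
        return Expression.Lambda(delegateType, body, parameters);
    }

    public static void Main()
    {
        var s = Expression.Parameter(typeof(string), "s");
        var body = Expression.GreaterThan(
            Expression.Property(s, "Length"), Expression.Constant(3));

        var lambda = BuildLambda(body, s);
        Console.WriteLine(lambda.Type == typeof(Func<string, bool>)); // True
    }
}
```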

So how do we change parameters of our function?

Changing Types of Parameters

This is done in an override of the VisitParameter method:

protected override Expression VisitParameter(ParameterExpression node)
{
    if (_argumentTypeChanges.ContainsKey(node.Type))
    {
        if (_argumentSubstitutions.ContainsKey(node.Name))
        {
            return _argumentSubstitutions[node.Name];
        }
                
        var substitutionParameter = Expression.Parameter(_argumentTypeChanges[node.Type], node.Name);
        _argumentSubstitutions[node.Name] = substitutionParameter;
        return substitutionParameter;
    }

    return base.VisitParameter(node);
}

Here the _argumentTypeChanges dictionary maps the initial types of arguments to the types of their substitutions. This dictionary is filled by the ChangeArgumentType method. When the type of a parameter is in this dictionary, we must create a new parameter expression with a different type. The problem is that the same parameter may be used in many places in the body of the lambda, and all these places must refer to the same ParameterExpression instance. For this reason, when we create a new ParameterExpression instance, we cache it in the _argumentSubstitutions dictionary and take it from this cache the next time we need a parameter with this name.
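
The requirement about the same instance is easy to check in isolation. In the sketch below (CompilesWith is a name made up for this example), two parameters with the same name and type are still different parameters, and a lambda whose body uses an undeclared one cannot be compiled.

```csharp
using System;
using System.Linq.Expressions;

public static class Program
{
    // Builds "declared => used < 30" and reports whether it can be compiled.
    public static bool CompilesWith(ParameterExpression declared, ParameterExpression used)
    {
        var body = Expression.LessThan(used, Expression.Constant(30));
        var lambda = Expression.Lambda<Func<int, bool>>(body, declared);
        try
        {
            lambda.Compile();
            return true;
        }
        catch (InvalidOperationException)
        {
            // The body refers to a parameter the lambda never declared.
            return false;
        }
    }

    public static void Main()
    {
        var first = Expression.Parameter(typeof(int), "age");
        var second = Expression.Parameter(typeof(int), "age"); // same name, different instance

        Console.WriteLine(CompilesWith(first, first));  // True
        Console.WriteLine(CompilesWith(first, second)); // False
    }
}
```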

The only thing left to do is to rewrite sequences of properties (.Location.Town -> .Address.City).

Rewriting Sequences of Properties

This part of work is done in an override of VisitMember method:

protected override Expression VisitMember(MemberExpression node)

The implementation of this method consists of two parts. The first part tries to rewrite a sequence of properties using the information gathered by calls to the ChangeProperty method:

var propertiesChange = _propertiesChanges.FirstOrDefault(pc => pc.SourceCorrespondsTo(node));
if (propertiesChange != null)
{
    Expression sequenceOrigin = propertiesChange.GetSequenceOriginExpression(node);
    Expression newSequenceOrigin = Visit(sequenceOrigin);
    return propertiesChange.GetNewPropertiesSequence(newSequenceOrigin);
}

First of all, we try to determine which sequence of properties we should rewrite. If such a sequence is found (propertiesChange is not null), we process it. For example, suppose we have this function:

personInfo => personInfo.Location.Town.StartsWith("L")

We want to replace the sequence of properties .Location.Town with .Address.City. We start by finding the origin of this sequence (sequenceOrigin), that is, the expression to which the sequence is applied. In this case, it is the personInfo parameter. Then we process this origin using the Visit method (in our case, the type of personInfo is changed from PersonInfo to Person). Finally, we construct the new sequence of properties on top of the new origin (GetNewPropertiesSequence).

We'll talk about the implementation of propertiesChange objects later. For now, let's consider the second part of the VisitMember method. Let's say both the PersonInfo and Person classes have several equivalent properties (e.g. Name, Age, ...) containing the same information. It is not very convenient to write this for every such property:

rewriter.ChangeProperty<PersonInfo>(pi => pi.Name).To<Person>(p => p.Name);

It is much better to process such properties by default. This is what the second part of the VisitMember method does:

Expression expression = Visit(node.Expression);
if (expression == node.Expression)
{
    return node;
}

if (expression.Type == node.Member.DeclaringType)
{
    return Expression.MakeMemberAccess(expression, node.Member);
}

var newMember = expression.Type.GetMember(node.Member.Name)
    .FirstOrDefault(m => m.MemberType == MemberTypes.Property || m.MemberType == MemberTypes.Field);
if (newMember == null)
{
    throw new InvalidOperationException(string.Format(
        "Type '{0}' does not contain field or property '{1}'", expression.Type, node.Member.Name));
}

return Expression.MakeMemberAccess(expression, newMember);

The MemberExpression object processed in the VisitMember method contains a System.Reflection.MemberInfo instance that indicates which member is accessed. Previously, this MemberInfo instance said that we access, say, the Name property of the PersonInfo class. Now we should access the Name property of the Person class. This is what this code does: it finds the new MemberInfo instance (newMember) and creates a new MemberExpression instance based on it.
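
To try this default behaviour in isolation, here is a stripped-down sketch. SameNameVisitor is a hypothetical class that handles a single parameter and same-named members only. It is much simpler than the article's RewritingVisitor, and the Person and PersonInfo classes are reduced to two properties.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public class PersonInfo { public string Name { get; set; } public int Age { get; set; } }
public class Person { public string Name { get; set; } public int Age { get; set; } }

// Simplified, hypothetical visitor: one parameter, same-named members only.
public class SameNameVisitor<TFrom, TTo> : ExpressionVisitor
{
    private readonly ParameterExpression _newParameter = Expression.Parameter(typeof(TTo), "p");

    public Expression<Func<TTo, bool>> Rewrite(Expression<Func<TFrom, bool>> filter)
    {
        return Expression.Lambda<Func<TTo, bool>>(Visit(filter.Body), _newParameter);
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        // Always hand out the same instance for the substituted parameter.
        return node.Type == typeof(TFrom) ? _newParameter : base.VisitParameter(node);
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        Expression expression = Visit(node.Expression);
        if (expression == node.Expression) return node;

        // Look the member up by name on the type of the rewritten expression.
        MemberInfo newMember = expression.Type.GetMember(node.Member.Name)
            .FirstOrDefault(m => m.MemberType == MemberTypes.Property
                              || m.MemberType == MemberTypes.Field);
        if (newMember == null)
            throw new InvalidOperationException(
                "No member " + node.Member.Name + " on " + expression.Type);

        return Expression.MakeMemberAccess(expression, newMember);
    }
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<PersonInfo, bool>> filter =
            pi => pi.Name.StartsWith("D") && pi.Age < 30;
        Expression<Func<Person, bool>> rewritten =
            new SameNameVisitor<PersonInfo, Person>().Rewrite(filter);

        Func<Person, bool> compiled = rewritten.Compile();
        Console.WriteLine(compiled(new Person { Name = "Dan", Age = 20 })); // True
        Console.WriteLine(compiled(new Person { Name = "Dan", Age = 40 })); // False
    }
}
```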

Replacing Sequences of Properties

Last but not least, let's consider how we replace one sequence of properties (.Location.Town) with another (.Address.City). Everything starts with a call to ChangeProperty:

rewriter.ChangeProperty<PersonInfo>(pi => pi.Location.Town).To<Person>(p => p.Address.City);

At this moment, two objects of type PropertiesSequence are created: one (source) for pi => pi.Location.Town and one (target) for p => p.Address.City. Each PropertiesSequence object contains a list of properties, stored starting from the outermost one, and the type of their origin (the object to which the sequence is applied). So for the source object:

properties = [ "Town", "Location" ]
originType = PersonInfo // type of pi

and for target object:

properties = [ "City", "Address" ]
originType = Person // type of p
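
A list like this can be collected by walking the MemberExpression chain from the outside in. The sketch below only illustrates the idea: GetPropertyNames is a hypothetical helper, and the real PropertiesSequence class in the download also stores result types, as the code further down shows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class Location { public string Town { get; set; } }
public class PersonInfo { public Location Location { get; set; } }

public static class Program
{
    // Hypothetical helper: lists property names from the outermost access inwards.
    public static List<string> GetPropertyNames<T, TResult>(Expression<Func<T, TResult>> selector)
    {
        var names = new List<string>();
        var member = selector.Body as MemberExpression;
        while (member != null)
        {
            names.Add(member.Member.Name);
            // Step towards the origin; the loop ends when the parameter is reached.
            member = member.Expression as MemberExpression;
        }
        return names;
    }

    public static void Main()
    {
        var names = GetPropertyNames((PersonInfo pi) => pi.Location.Town);
        Console.WriteLine(string.Join(", ", names)); // Town, Location
    }
}
```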

These two PropertiesSequence objects are packed into a PropertiesChange object. This object has several helper methods that do the required work.

First of all, the SourceCorrespondsTo method checks whether the expression passed as its parameter indeed corresponds to the sequence of properties in the source object. This means that the expression represents the same sequence of accesses to the same properties, and that its origin has the same type as the origin type stored in the source object.

public bool SourceCorrespondsTo(Expression expression)
{
    expression = GetSequenceOriginExpression(expression);

    return expression != null && expression.Type == _source.SequenceOriginType;
}

public Expression GetSequenceOriginExpression(Expression expression)
{
    foreach (var propertyInfo in _source.Properties)
    {
        var memberExpression = expression as MemberExpression;
        if (memberExpression == null)
        { return null; }

        if (memberExpression.Member.Name != propertyInfo.Name)
        { return null; }
        if (memberExpression.Type != propertyInfo.ResultType)
        { return null; }

        expression = memberExpression.Expression;
    }

    return expression;
}

The GetSequenceOriginExpression method returns the origin of the sequence of properties if the expression corresponds to the source object, and null otherwise.

Let me remind you how these methods are used in the VisitMember method of the RewritingVisitor class:

var propertiesChange = _propertiesChanges.FirstOrDefault(pc => pc.SourceCorrespondsTo(node));
if (propertiesChange != null)
{
    Expression sequenceOrigin = propertiesChange.GetSequenceOriginExpression(node);
    Expression newSequenceOrigin = Visit(sequenceOrigin);
    return propertiesChange.GetNewPropertiesSequence(newSequenceOrigin);
}

We find the PropertiesChange object whose source sequence of properties corresponds to the current expression. Then we get the origin of this sequence from the current expression and rewrite it by calling Visit(sequenceOrigin). Finally, we construct the new sequence of properties using the GetNewPropertiesSequence method:

public Expression GetNewPropertiesSequence(Expression sequenceOrigin)
{
    if (sequenceOrigin == null) throw new ArgumentNullException("sequenceOrigin");

    if(sequenceOrigin.Type != _target.SequenceOriginType)
        throw new ArgumentException("Type of rewritten properties sequence is incorrect.", 
                                    "sequenceOrigin");

    foreach (var propertyInfo in _target.Properties.Reverse())
    {
        var memberInfo = sequenceOrigin.Type.GetMember(propertyInfo.Name)
            .FirstOrDefault(m => m.MemberType == MemberTypes.Field || 
                            m.MemberType == MemberTypes.Property);
        if(memberInfo == null)
            throw new InvalidOperationException("Unable to create rewritten properties sequence");

        sequenceOrigin = Expression.MakeMemberAccess(sequenceOrigin, memberInfo);
    }

    return sequenceOrigin;
}

First of all, it checks if type of new origin is equal to the type of origin stored in the target object. Then it reconstructs expression of the target sequence of properties based on this new origin.
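
To see the whole replacement work end to end without the downloadable sources, here is a deliberately small sketch. It hard-codes the single mapping .Location.Town -> .Address.City in a hypothetical TownToCityVisitor, whereas the article's rewriter keeps the mappings in PropertiesChange objects. The entity classes are reduced to the properties involved, and any other property of PersonInfo is not handled.

```csharp
using System;
using System.Linq.Expressions;

public class Location { public string Town { get; set; } }
public class PersonInfo { public Location Location { get; set; } }
public class Address { public string City { get; set; } }
public class Person { public Address Address { get; set; } }

// Hypothetical visitor that knows exactly one mapping.
public class TownToCityVisitor : ExpressionVisitor
{
    private readonly ParameterExpression _person = Expression.Parameter(typeof(Person), "p");

    public Expression<Func<Person, bool>> Rewrite(Expression<Func<PersonInfo, bool>> filter)
    {
        return Expression.Lambda<Func<Person, bool>>(Visit(filter.Body), _person);
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node.Type == typeof(PersonInfo) ? _person : base.VisitParameter(node);
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Match <origin>.Location.Town where the origin has type PersonInfo.
        var inner = node.Expression as MemberExpression;
        if (node.Member.Name == "Town" && inner != null && inner.Member.Name == "Location"
            && inner.Expression != null && inner.Expression.Type == typeof(PersonInfo))
        {
            Expression newOrigin = Visit(inner.Expression);
            // Rebuild the target sequence .Address.City on the rewritten origin.
            return Expression.Property(Expression.Property(newOrigin, "Address"), "City");
        }
        return base.VisitMember(node);
    }
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<PersonInfo, bool>> filter = pi => pi.Location.Town.StartsWith("L");
        Expression<Func<Person, bool>> rewritten = new TownToCityVisitor().Rewrite(filter);

        var london = new Person { Address = new Address { City = "London" } };
        var paris = new Person { Address = new Address { City = "Paris" } };
        Func<Person, bool> compiled = rewritten.Compile();
        Console.WriteLine(compiled(london)); // True
        Console.WriteLine(compiled(paris));  // False
    }
}
```

Here the rewritten filter is compiled and run in memory only to check the result; in the service it would be passed to Where on the Entity Framework query instead.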

Conclusion

Let me now summarize the results of this article. The approach described here allows consumers of our services to write filters using the familiar syntax of their programming language. They can use many standard functions and operators in their filters, as well as any properties of the classes returned by the service. Also, the compiler checks the code of the filters, and renaming properties is a safe operation.

But at the same time, there are a lot of problems here:

The described approach can be used only if a .NET client is provided for our service. Using it from JavaScript is inconvenient, if possible at all.
There are problems with transferring the filter expression from the client to the service.
The expression rewriter still has to be reconfigured after some changes to the classes involved (Person and PersonInfo).
Some transformations of filter expressions that require changing result types are not implemented yet.

I consider the work described in this article a proof of concept, not a ready-made solution. I hope it will help you, and that you'll be able to improve it and adapt it to your needs.

History

Revision	Date	Comment
1.0	04.12.2014	Initial revision

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)