Introduction
I think many of us write simple Web API REST services the following way: we use the parameters of a service method to construct an Entity Framework LINQ query and return the results. Sometimes, however, we do not return EF entity objects directly but repack them into other objects:
public PersonInfo[] GetPersons()
{
    return _dbContext.Persons
        .Select(p => new PersonInfo { Id = p.PersonId, Name = p.Name })
        .ToArray();
}
The reasons for repacking can differ:
- Maybe you want to send the consumer only some of the properties of the EF entity objects.
- Maybe you construct the objects for the consumer from several EF entity objects.
- Maybe you want to arrange some properties of the EF entity objects into groups represented by other classes.
- Etc.
In any case, sooner or later you face the problem of filtering. Your method can return hundreds or thousands of objects, but consumers may want only those that satisfy some criteria. How do we implement this filtering?
Certainly, there are simple cases. If consumers only want to filter persons by name, you may write something like this:
public PersonInfo[] GetPersons(string startOfName)
{
    return _dbContext.Persons
        .Where(p => p.Name.StartsWith(startOfName))
        .Select(p => new PersonInfo { Id = p.PersonId, Name = p.Name })
        .ToArray();
}
But in general, a consumer may want to filter the data from your service by any property of the PersonInfo class. Moreover, the filter conditions can vary widely. Let's say the PersonInfo class has an Age property. In this case, here are some possible filters:
personInfo.Age == 30
personInfo.Age < 50
personInfo.Age >= 20 && personInfo.Age < 60
How can we support all these variants? One way is to create a special filter class:
public class FilterCondition
{
    public string PropertyName { get; set; }
    public OperationType Operation { get; set; }
    public object Value { get; set; }
}
public enum OperationType
{
    Equals,
    Less,
    Greater,
}
In this case, the consumer can call our service like this:
service.GetPersons(new []{
    new FilterCondition { PropertyName = "Age", Operation = OperationType.Less, Value = 50 }
});
And the service method will look like this:
public PersonInfo[] GetPersons(FilterCondition[] conditions)
{
    // Declared as IQueryable<Person> so that the result of Where can be assigned back.
    IQueryable<Person> persons = _dbContext.Persons;
    foreach (var condition in conditions)
    {
        var value = condition.Value;
        if (condition.PropertyName == "Age")
        {
            switch (condition.Operation)
            {
                case OperationType.Less:
                    persons = persons.Where(p => p.Age < (int)value);
                    break;
                ....
            }
        }
        ....
    }
    return persons.Select(p => new PersonInfo { Id = p.PersonId, Name = p.Name }).ToArray();
}
In my opinion, this implementation has several serious drawbacks:
- First of all, the implementation of the GetPersons service method is too complex. We have to write a separate case for every property we want to filter on. Sometimes we can improve the situation using reflection, but there are complex cases. For example, the Location.Town property of the PersonInfo class can be filled from the Address.City property of the Person EF entity class.
- If we want to add filtering on a new property, we have to change the GetPersons method.
- The list of operations in the OperationType enum can never be complete. There will always be cases when a consumer wants to apply another type of filter, and we will have to change the GetPersons method to support the new operation.
- For the consumer, it is not very convenient to create a list of FilterCondition objects.
- It is not easy to rename properties of the Person and PersonInfo classes. They are bound to the values of the PropertyName property of the FilterCondition class, which is just a string. So after renaming, one has to check manually that all PropertyName values are still correct.
Solution
So what would the desired solution look like? What if the consumer could write arbitrary filter expressions like this:
var persons = service.GetPersons(p => p.Name.StartsWith("D") && p.Age < 30);
I think this is a very convenient approach. The consumer can write any filter using any functions, the compiler checks the names of properties and their types, and Visual Studio renames properties automatically when needed.
But is this solution achievable? Well, partly.
As you probably know, EF methods like Where accept Expression objects as parameters. These Expression objects represent an object model of our code. A lambda with an expression body can be converted into an Expression object:
Expression<Func<PersonInfo, bool>> expr = p => p.Name.StartsWith("D") && p.Age < 30;
We can define our GetPersons method like this:
public PersonInfo[] GetPersons(Expression<Func<PersonInfo, bool>> filter)
and use this filter expression inside.
But there are 2 major obstacles.
Obstacle 1: Transmission
If we have a Web API REST service, we must somehow transfer the Expression object from the consumer to the service. Unfortunately, the Expression class is not serializable, although there are third-party libraries that can do it (expressiontree.codeplex.com). Also, to deserialize this object on the service side, all constructs inside the expression must be known on the service side. It means that consumers can't write their filters like this:
Expression<Func<PersonInfo, bool>> expr = p => p.Age < GetDesiredAge();
because the service side does not know the GetDesiredAge function. Instead, something like this should be used:
var age = GetDesiredAge();
Expression<Func<PersonInfo, bool>> expr = p => p.Age < age;
There are similar limitations for using classes in filter expressions.
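One possible way to make such a filter self-contained before sending it is to replace reads of captured local variables with constants. The following is only a sketch under that assumption; the CapturedVariableInliner and InlinerDemo names are hypothetical and are not part of the article's source code. It relies on the fact that the C# compiler stores captured locals as fields of a closure object, which appears in the expression tree as a constant.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: replaces reads of captured local variables
// with constants holding their current values.
public sealed class CapturedVariableInliner : ExpressionVisitor
{
    protected override Expression VisitMember(MemberExpression node)
    {
        // A captured local becomes a field of a compiler-generated closure
        // object; that object is a ConstantExpression in the tree.
        if (node.Expression is ConstantExpression closure && node.Member is FieldInfo field)
        {
            object value = field.GetValue(closure.Value);
            return Expression.Constant(value, node.Type);
        }
        return base.VisitMember(node);
    }
}

public static class InlinerDemo
{
    public static Expression<Func<int, bool>> Build()
    {
        int age = 30; // stands in for the result of GetDesiredAge()
        Expression<Func<int, bool>> expr = x => x < age;
        // After the visit, the right operand is a constant instead of a closure field read.
        return (Expression<Func<int, bool>>)new CapturedVariableInliner().Visit(expr);
    }
}
```

After this step the tree no longer refers to the client-side closure class, which is one of the things a service could not resolve during deserialization.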
Obstacle 2: Rewriting
There is one more obstacle, which in my opinion is more important than the previous one. As you can see, we have an Expression<Func<PersonInfo, bool>> object, whereas Entity Framework needs an Expression<Func<Person, bool>> object. This is where expression rewriting comes into play. You can download the source code of the expression rewriter from this article. Below, we'll discuss how it works.
Implementation of Rewriting
Roughly speaking, we need a function that takes an Expression<Func<PersonInfo, bool>> and returns an Expression<Func<Person, bool>>. But to do it, we need some additional information. We need to know that the Location.Town property of the PersonInfo class should be replaced with the Address.City property of the Person class, etc.
var rewriter = new ExpressionRewriter();
rewriter.ChangeArgumentType<PersonInfo>().To<Person>();
rewriter.ChangeProperty<PersonInfo>(pi => pi.Status).To<Person>(p => p.FamilyStatus);
rewriter.ChangeProperty<PersonInfo>(pi => pi.Country).To<Person>(p => p.Address.Country);
rewriter.ChangeProperty<PersonInfo>(pi => pi.Location.Town).To<Person>(p => p.Address.City);
As you can see, this information is provided using the ChangeProperty method. The ChangeArgumentType method says that the PersonInfo class in the arguments of the function represented by our expression should be replaced with the Person class.
OK, now we have all the required information and can start rewriting the expression. In fact, the term 'rewriting' is not strictly correct. All descendants of the Expression class are read-only and cannot be modified, so we need to create a new expression based on the existing one. Fortunately, Microsoft provides the ExpressionVisitor class, which makes this work easier. You can inherit your expression changer from this class:
class RewritingVisitor : ExpressionVisitor
You pass your initial expression to the Visit method, and it returns the changed expression. If you don't override any of the methods of ExpressionVisitor, the returned expression is the same as the initial one. But we will override some of them.
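The behaviour of an unmodified visitor can be checked with a small experiment. This is a minimal sketch; NoOpVisitor and VisitorDemo are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// A visitor that overrides nothing: every node is visited but none is replaced.
public sealed class NoOpVisitor : ExpressionVisitor { }

public static class VisitorDemo
{
    public static bool ReturnsSameInstance()
    {
        Expression<Func<int, bool>> expr = x => x < 30;
        Expression visited = new NoOpVisitor().Visit(expr);
        // No node was changed, so the visitor hands back the original tree.
        return ReferenceEquals(expr, visited);
    }
}
```

This is also why the overrides discussed in this article compare the visited children with the original ones: an unchanged child means the original node can be returned as is.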
Changing Type of Filtering Function
As mentioned above, on input we have an Expression<Func<PersonInfo, bool>>, but we need an Expression<Func<Person, bool>>. So Func<PersonInfo, bool> must be changed to Func<Person, bool>. This is done in an override of the VisitLambda method:
protected override Expression VisitLambda<T>(Expression<T> node)
{
    var body = Visit(node.Body);
    var parameters = VisitAndConvert(node.Parameters, "VisitLambda");
    if (body == node.Body && parameters == node.Parameters)
    { return node; }
    Type delegateType;
    var funcGenericTypes = new List<Type>(parameters.Select(p => p.Type));
    funcGenericTypes.Add(body.Type);
    var funcType = typeof(Func<>).Assembly.GetTypes()
        .Where(t => t.Name.StartsWith("Func`"))
        .Where(t => t.IsGenericType)
        .FirstOrDefault(t => t.GetGenericArguments().Length == funcGenericTypes.Count);
    if (funcType == null)
    {
        throw new InvalidOperationException("Can't find corresponding Func<> type");
    }
    delegateType = funcType.MakeGenericType(funcGenericTypes.ToArray());
    return Expression.Lambda(delegateType, body, parameters);
}
First of all, we rewrite the body and the parameters of the method. If they were not changed during the rewriting, we return the initial expression. But if they were changed, we create a new delegate type delegateType for our rewritten expression and return a new expression for this new function.
Here, potentially, we can have a problem. We assume that the input expression has type Expression<Func<...>>. But in general, it can be Expression<AnyDelegate>. It is hard to say what should be done with AnyDelegate. So for simplicity, and given the limited usage as a filter expression in Entity Framework, we'll stick to our previous assumption.
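As a side note, under the same Func-only assumption the delegate type can also be obtained without scanning an assembly. The sketch below is an alternative, not the code of this article; it uses the standard Expression.GetFuncType method and the Expression.Lambda overload that infers the delegate type. The DelegateTypeDemo class name is hypothetical.

```csharp
using System;
using System.Linq.Expressions;

public static class DelegateTypeDemo
{
    // typeArgs: the parameter types followed by the return type.
    public static Type BuildFuncType(params Type[] typeArgs)
    {
        return Expression.GetFuncType(typeArgs);
    }

    // Builds s => s.Length == 1 and lets Expression.Lambda infer the delegate type.
    public static LambdaExpression BuildInferredLambda()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");
        Expression body = Expression.Equal(
            Expression.Property(s, "Length"),
            Expression.Constant(1));
        return Expression.Lambda(body, s);
    }
}
```

Either call could replace the reflection-based lookup in VisitLambda if only Func delegates need to be supported.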
So how do we change parameters of our function?
Changing Types of Parameters
It is done using an override of the VisitParameter method.
protected override Expression VisitParameter(ParameterExpression node)
{
    if (_argumentTypeChanges.ContainsKey(node.Type))
    {
        if (_argumentSubstitutions.ContainsKey(node.Name))
        {
            return _argumentSubstitutions[node.Name];
        }
        var substitutionParameter = Expression.Parameter(_argumentTypeChanges[node.Type], node.Name);
        _argumentSubstitutions[node.Name] = substitutionParameter;
        return substitutionParameter;
    }
    return base.VisitParameter(node);
}
Here, we have the dictionary _argumentTypeChanges containing the initial types of arguments and the types of their substitutions. This dictionary is built using the ChangeArgumentType method. When the type of a parameter is in this dictionary, we must create a new parameter expression with a different type. The problem is that the same parameter may be used in many places in the body of the method, and in all these places there must be the same instance of the ParameterExpression object. For this reason, when we create a new instance of ParameterExpression, we cache it in the _argumentSubstitutions dictionary and take it from this cache the next time we need the parameter with this name.
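Why the instance matters can be seen in a small experiment: two ParameterExpression objects with the same name and type are still different variables. The following is only an illustrative sketch; ParameterIdentityDemo is a hypothetical name.

```csharp
using System;
using System.Linq.Expressions;

public static class ParameterIdentityDemo
{
    // The body and the lambda signature share one ParameterExpression instance.
    public static bool CompilesWithSharedParameter()
    {
        ParameterExpression p = Expression.Parameter(typeof(int), "p");
        Expression body = Expression.LessThan(p, Expression.Constant(30));
        var lambda = Expression.Lambda<Func<int, bool>>(body, p);
        return lambda.Compile()(20);
    }

    // The body uses a different instance that merely has the same name and type.
    public static bool FailsWithDistinctParameters()
    {
        ParameterExpression declared = Expression.Parameter(typeof(int), "p");
        ParameterExpression used = Expression.Parameter(typeof(int), "p");
        Expression body = Expression.LessThan(used, Expression.Constant(30));
        var lambda = Expression.Lambda<Func<int, bool>>(body, declared);
        try
        {
            lambda.Compile();
            return false;
        }
        catch (InvalidOperationException)
        {
            // The variable used in the body is not defined in the lambda's scope.
            return true;
        }
    }
}
```

Caching the substitution guarantees that the rewritten body and the rewritten parameter list refer to one and the same object.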
The only thing left to do is to rewrite sequences of properties (.Location.Town -> .Address.City).
Rewriting Sequences of Properties
This part of the work is done in an override of the VisitMember method:
protected override Expression VisitMember(MemberExpression node)
The implementation of this method consists of two parts. The first one tries to rewrite a sequence of properties using the information gathered by calls to the ChangeProperty method:
var propertiesChange = _propertiesChanges.FirstOrDefault(pc => pc.SourceCorrespondsTo(node));
if (propertiesChange != null)
{
    Expression sequenceOrigin = propertiesChange.GetSequenceOriginExpression(node);
    Expression newSequenceOrigin = Visit(sequenceOrigin);
    return propertiesChange.GetNewPropertiesSequence(newSequenceOrigin);
}
First of all, we try to understand which sequence of properties we should rewrite. If this sequence is found (propertiesChange is not null), we process it. For example, we have the function:
personInfo => personInfo.Location.Town.StartsWith("L")
We want to replace the sequence of properties .Location.Town with .Address.City. We start by finding the origin of this sequence (sequenceOrigin), that is, the expression to which this sequence is applied. In this case, it is the parameter personInfo. Then, we process this origin using the Visit method (in our case, the type of personInfo will be changed from PersonInfo to Person). And then, we construct a new sequence of properties based on the new origin (GetNewPropertiesSequence).
We'll talk about the implementation of propertiesChange objects later. For now, let's consider the second part of the VisitMember method. Let's say both the PersonInfo and Person classes have several equivalent properties (e.g. Name, Age, ...) containing the same information. It is not very convenient to write the following for every such property:
rewriter.ChangeProperty<PersonInfo>(pi => pi.Name).To<Person>(p => p.Name);
It is much better to process such properties by default. This is what the second part of the VisitMember method does.
Expression expression = Visit(node.Expression);
if (expression == node.Expression)
{
    return node;
}
if (expression.Type == node.Member.DeclaringType)
{
    return Expression.MakeMemberAccess(expression, node.Member);
}
var newMember = expression.Type.GetMember(node.Member.Name)
    .FirstOrDefault(m => m.MemberType == MemberTypes.Property || m.MemberType == MemberTypes.Field);
if (newMember == null)
{
    throw new InvalidOperationException(string.Format(
        "Type '{0}' does not contain field or property '{1}'", expression.Type, node.Member.Name));
}
return Expression.MakeMemberAccess(expression, newMember);
The MemberExpression object processed in the VisitMember method contains a System.Reflection.MemberInfo instance indicating which member is accessed. Previously, this MemberInfo instance referred to, say, the Name property of the PersonInfo class. Now we must access the Name property of the Person class. This is what the code does: it finds the corresponding MemberInfo instance (newMember) on the new type and creates a new MemberExpression instance based on it.
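The default handling of same-named members can be tried in isolation. The following sketch is a simplified stand-in for the rewriter of this article, not its actual code: it handles a single parameter and only members with matching names. The PersonDto, PersonEntity and SameNameRewriter names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical stand-ins for PersonInfo and Person.
public class PersonDto { public string Name { get; set; } public int Age { get; set; } }
public class PersonEntity { public string Name { get; set; } public int Age { get; set; } }

public sealed class SameNameRewriter : ExpressionVisitor
{
    private readonly ParameterExpression _from;
    private readonly ParameterExpression _to;

    private SameNameRewriter(ParameterExpression from, ParameterExpression to)
    {
        _from = from;
        _to = to;
    }

    // Every use of the old parameter is replaced with the same new instance.
    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node == _from ? _to : node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        Expression target = Visit(node.Expression);
        if (target == node.Expression)
        {
            return node; // nothing below this node has changed
        }
        // Look up a field or property with the same name on the new type.
        MemberInfo member = target.Type.GetMember(node.Member.Name)
            .FirstOrDefault(m => m.MemberType == MemberTypes.Property || m.MemberType == MemberTypes.Field);
        if (member == null)
        {
            throw new InvalidOperationException(
                "Type '" + target.Type + "' has no field or property '" + node.Member.Name + "'");
        }
        return Expression.MakeMemberAccess(target, member);
    }

    public static Expression<Func<PersonEntity, bool>> Rewrite(Expression<Func<PersonDto, bool>> filter)
    {
        ParameterExpression from = filter.Parameters[0];
        ParameterExpression to = Expression.Parameter(typeof(PersonEntity), from.Name);
        Expression body = new SameNameRewriter(from, to).Visit(filter.Body);
        return Expression.Lambda<Func<PersonEntity, bool>>(body, to);
    }
}
```

A call such as SameNameRewriter.Rewrite(p => p.Name.StartsWith("D") && p.Age < 30) then yields a filter over PersonEntity that could be passed to Where.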
Replacing Sequences of Properties
And last but not least, let's consider how we can replace one sequence of properties (.Location.Town) with another (.Address.City). Everything starts with a call to ChangeProperty:
rewriter.ChangeProperty<PersonInfo>(pi => pi.Location.Town).To<Person>(p => p.Address.City);
At this moment, two objects of type PropertiesSequence are created: one (source) for pi => pi.Location.Town and one (target) for p => p.Address.City. Each PropertiesSequence object contains a list of properties, stored starting from the outermost one, and the type of their origin (the object to which the sequence is applied). So for the source object:
properties = [ "Town", "Location" ]
originType = PersonInfo
and for target object:
properties = [ "City", "Address" ]
originType = Person
These two objects of the PropertiesSequence type are packed into a PropertiesChange object. This object has several helper methods that do the required work.
First of all, the SourceCorrespondsTo method checks whether the expression passed to it indeed corresponds to the sequence of properties in the source object. It means that the expression represents the same sequence of accesses to the same properties, and that their origin has the same type as the origin type stored in the source object.
public bool SourceCorrespondsTo(Expression expression)
{
    expression = GetSequenceOriginExpression(expression);
    return expression != null && expression.Type == _source.SequenceOriginType;
}
public Expression GetSequenceOriginExpression(Expression expression)
{
    foreach (var propertyInfo in _source.Properties)
    {
        var memberExpression = expression as MemberExpression;
        if (memberExpression == null)
        { return null; }
        if (memberExpression.Member.Name != propertyInfo.Name)
        { return null; }
        if (memberExpression.Type != propertyInfo.ResultType)
        { return null; }
        expression = memberExpression.Expression;
    }
    return expression;
}
The GetSequenceOriginExpression method returns the origin of the sequence of properties if it corresponds to the source object, and null otherwise.
Let me remind you how these methods are used in the VisitMember method of the RewritingVisitor class:
var propertiesChange = _propertiesChanges.FirstOrDefault(pc => pc.SourceCorrespondsTo(node));
if (propertiesChange != null)
{
    Expression sequenceOrigin = propertiesChange.GetSequenceOriginExpression(node);
    Expression newSequenceOrigin = Visit(sequenceOrigin);
    return propertiesChange.GetNewPropertiesSequence(newSequenceOrigin);
}
We find the PropertiesChange object whose source sequence of properties corresponds to the current expression. Then we get the origin of this sequence from the current expression and rewrite it by calling Visit(sequenceOrigin). Finally, we construct the new sequence of properties using the GetNewPropertiesSequence method:
public Expression GetNewPropertiesSequence(Expression sequenceOrigin)
{
    if (sequenceOrigin == null) throw new ArgumentNullException("sequenceOrigin");
    if (sequenceOrigin.Type != _target.SequenceOriginType)
        throw new ArgumentException("Type of rewritten properties sequence is incorrect.",
            "sequenceOrigin");
    foreach (var propertyInfo in _target.Properties.Reverse())
    {
        var memberInfo = sequenceOrigin.Type.GetMember(propertyInfo.Name)
            .FirstOrDefault(m => m.MemberType == MemberTypes.Field ||
                                 m.MemberType == MemberTypes.Property);
        if (memberInfo == null)
            throw new InvalidOperationException("Unable to create rewritten properties sequence");
        sequenceOrigin = Expression.MakeMemberAccess(sequenceOrigin, memberInfo);
    }
    return sequenceOrigin;
}
First of all, it checks that the type of the new origin is equal to the origin type stored in the target object. Then it reconstructs the expression of the target sequence of properties on top of this new origin.
Conclusion
Let me now summarize the results of this article. The approach described here allows consumers of our services to write filters in the familiar syntax of their programming language. They can use many standard functions and operators, and any property of the classes returned by the service. In addition, the compiler checks the filter code, and renaming properties is a safe operation.
But at the same time, there are a lot of problems here:
- The described approach can be used only if a .NET client is provided for our service. Using it from JavaScript is inconvenient, if possible at all.
- There are problems with transferring the filter expression from the client to the service.
- The expression rewriter still has to be reconfigured after some changes to the classes involved (Person and PersonInfo).
- Some transformations of the filter expression that require changing result types are not implemented yet.
I consider the work described in this article as a proof of concept, not a ready solution. I hope it will help you and you'll be able to improve it and adjust it for your needs.
History
Revision | Date | Comment |
1.0 | 04.12.2014 | Initial revision |