Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Expression Tree Traversal Via Visitor Pattern in Practice

0.00/5 (No votes)
15 May 2018 1  
How to deal with expression trees, presented at C# as generic Expression type, to achieve some practical goals, for example - translating them to specific language like T-SQL.

Problem

I will consider a very common problem - we need to translate Expression to some language not from .NET domain. For example, with Entity Framework, we can write code:

var order = context.Orders.Where(x => x.Customer == "Tom" && 
                  x.Amount > 1000 && x.TheDate == today).ToList();

It will return orders, which are satisfied presented condition. This predicate has type: Expression<Func<Order, bool>> and represents expression tree with structure like this:

                                           ____________  && ____________
                                         /                                               \
                              _____&&______                                 ____==___
                            /                       \                              /                \
                  ___==___                  ___>____             x.TheDate       today
                /               \               /             \           
      x.Customer     "Tom"    x.Amount    1000

Entity Framework will translate it to, for example, T-SQL syntax (depends on underlying provider):

"SELECT * FROM Orders WHERE Customer = 'Tome' and Amount > 1000 and TheDate = '2018-04-28'"

In this way, our goal is to figure out how it was made and perform custom translation to some another language for demonstrating purposes.

There is Azure Table Storage (https://docs.microsoft.com/en-us/azure/cosmos-db/table-storage-how-to-use-dotnet) cloud infrastructure. You can think about it as some kind of storage like database - tables with columns, rows and data. We also can specify filters, when fetching data from them. Rules (required syntax) are presented here. They are:

Comparison Operators Table
Logical operator Description Example filter string
eq Equal City eq 'Redmond'
gt Greater than Price gt 20
ge Greater than or equal to Price ge 10
lt Less than Price lt 20
le Less than or equal Price le 100
ne Not equal City ne 'London'
and And Price le 200 and Price gt 3.5
or Or Price le 3.5 or Price gt 200
not Not not isAvailable

String properties should be wrapped into single quotes: Name eq 'Tom'. Datetime values should be presented this way: BirtDate eq datetime'2008-07-10T00:00:00Z', all the rest are via ToString() call. So the full condition should be: Customer eq 'Tome' and Amount > 1000 and TheDate eq datetime'2008-07-10T00:00:00Z'. Finally, our task - is to convert .NET Expression to this query syntax.

DISCLAIMER: There is Nuget package - WindowsAzure.Storage (https://www.nuget.org/packages/WindowsAzure.Storage/), which already has this functionality, but our main goal is not to create traslation to concrete language - Azure Storage Tables syntax, but instead - understand how to do that and thus learn how to build queries to arbitrary language or solve another custom problem.

Solution

Each Expression - is a tree with nodes, each of them also is Expression. Moreover, each node belongs to certain type of Expression, which is inherited from base one. But when we create Expression tree, compiler can't tell us with what concrete type we deal with, because there is an infinite number of possible combinations (trees structures), for example, BinaryExpression has two child nodes, each of them can belong to any Expression type. So actual types of root node and its children as well as tree's structure will be known only at runtime. At compile time, nodes will be presented as base Expression type and structure as references from parent nodes to children.

If we will try to solve our problem directly, we should recursively traversal Expression tree, process each node depending on its type, i.e., we should figure out with which actual type we work and then process node respectively. For that, we should use reflection (expression.GetType(), etc). So it will be a very difficult and error prone approach. But fortunately, there is type: ExpressionVisitor.

Visitor Pattern

Idea of visitor pattern - we have type, and want to add more functionality to it, but without modifying. For that, we will create Visitor, which will contain this functionality. Each visitor can add new capabilities not only to particular type but to several ones (ElementB, ElementA), inherited from base one (Element). So it is exactly our case, main type - Expression (Element), inherited types: BinaryExpression (ElementB), UnaryExpression(ElementA), etc.; new functionality - query building (visitElementB, visitElementA). Expression (Element) type will accept visitor and then will call corresponding visitor's method: VisitUnary (visitElementB) or VisitBinary (visitElementA), depending on own type and will pass itself to it as argument. This way, we will have visitor's methods, where parameter will not belong to base abstract Expression type, but instead to actual one. So we should not detect type via reflection at runtime via ugly expression.GetType(), moreover ExpressionVisitor also provides functionality to recursively traversal tree in a simple manner. That means it is exactly our choice.

Implementation

Suppose we have class Order:

public class Order 
{
    public DateTime TheDate { get; set; }
    public string Customer { get; set; }
    public decimal Amount { get; set; }
    public bool Discount { get; set; }
}   

and we want to write business code like this:

var date = DateTime.Today;
var order = new Order { Customer = "Tom", Amount = 1000 };

Expression<Func<Order, bool>> filter = x => (x.Customer == order.Customer && x.Amount > order.Amount) 
                                            || 
                                            (x.TheDate == date && !x.Discount);
var visitor = new FilterConstructor();
var query = visitor.GetQuery(filter);
var data = AzureTableReference.Get<Order>(query);
//desired formatted query:
//(
//    (
//        (Customer eq 'Tom') 
//        and 
//        (Amount gt 1000)
//    ) 
//    or 
//    (
//        (TheDate eq datetime'2018-04-27T21:00:00.000Z') 
//        and 
//        not (Discount)
//    )
//)

Let's create FilterConstructor class inherited from ExpressionVisitor:

public class FilterConstructor : ExpressionVisitor
{            
    private static readonly Dictionary<ExpressionType, string> _logicalOperators;
    private static readonly Dictionary<Type, Func<object, string>> _typeConverters;
    static FilterConstructor()
    {
        //mappings for table, shown above
        _logicalOperators = new Dictionary<ExpressionType, string>
        {
            [ExpressionType.Not] = "not",
            [ExpressionType.GreaterThan] = "gt",
            [ExpressionType.GreaterThanOrEqual] = "ge",
            [ExpressionType.LessThan] = "lt",
            [ExpressionType.LessThanOrEqual] = "le",
            [ExpressionType.Equal] = "eq",
            [ExpressionType.Not] = "not",
            [ExpressionType.AndAlso] = "and",
            [ExpressionType.OrElse] = "or"
        };

        //if type is string we will wrap it into single quotes
        //if it is a DateTime we will format it like datetime'2008-07-10T00:00:00Z'
        //bool.ToString() returns "True" or "False" with first capital letter, so .ToLower() is applied
        //if it is one of the rest "simple" types we will just call .ToString() method on it
        _typeConverters = new Dictionary<Type, Func<object, string>>
        {
            [typeof(string)] = x => $"'{x}'",
            [typeof(DateTime)] = 
              x => $"datetime'{((DateTime)x).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffZ")}'",
            [typeof(bool)] = x => x.ToString().ToLower()
        };
    }

    private StringBuilder _queryStringBuilder;
    private Stack<string> _fieldNames;
    public FilterConstructor()
    {
        //here we will collect our query
        _queryStringBuilder = new StringBuilder();
        //will be discussed below
        _fieldNames = new Stack<string>();
    }

    //entry point
    public string GetQuery(LambdaExpression predicate)
    {
        //Visit transfer abstract Expression to concrete method, like VisitUnary
        //it's invocation chain (at case of unary operator) approximetely looks this way:
        //inside visitor: predicate.Body.Accept(ExpressionVisitor this)
        //inside expression(visitor is this from above): visitor.VisitUnary(this) 
        //here this is Expression
        //we not pass whole predicate, just Body, because we not need predicate.Parameters: "x =>" part
        Visit(predicate.Body);
        var query = _queryStringBuilder.ToString();
        _queryStringBuilder.Clear();
                
        return query;
    }

    protected override Expression VisitUnary(UnaryExpression node)
    {                
        //assume we only allow not (!) unary operator:
        if(node.NodeType != ExpressionType.Not)
            throw new NotSupportedException("Only not(\"!\") unary operator is supported!");

        _queryStringBuilder.Append($"{_logicalOperators[node.NodeType]} ");//!

        _queryStringBuilder.Append("(");                                   //(!
        //go down from a tree
        Visit(node.Operand);                                               //(!expression
        _queryStringBuilder.Append(")");                                   //(!expression)

        //we should return expression, it will allow to create new expression based on existing one,
        //but, at our case, it is not needed, so just return initial node argument
        return node;
    }

    //corresponds to: and, or, greater than, less than, etc.
    protected override Expression VisitBinary(BinaryExpression node)
    {
        _queryStringBuilder.Append("(");                                    //(
        //left side of binary operator
        Visit(node.Left);                                                   //(leftExpr

        _queryStringBuilder.Append($" {_logicalOperators[node.NodeType]} ");//(leftExpr and

       //right side of binary operator
        Visit(node.Right);                                                  //(leftExpr and RighExpr
        _queryStringBuilder.Append(")");                                    //(leftExpr and RighExpr)

        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        //corresponds to: order.Customer, .order, today variables
        //when we pass parameters to expression via closure, CLR internally creates class:

        //class NameSpace+<>c__DisplayClass12_0
        //{
        //    public Order order;
        //    public DateTime today;
        //}

        //which contains values of parameters. When we face order.Customer, it's node.Expression
        //will not have reference to value "Tom", but instead reference to parent (.order), so we
        //will go to it via Visit(node.Expression) and also save node.Member.Name into 
        //Stack(_fieldNames) fo further usage. order.Customer has type ExpressionType.MemberAccess. 
        //.order - ExpressionType.Constant, because it's node.Expression is ExpressionType.Constant
        //(VisitConstant will be called) that is why we can get it's actual value(instance of Order). 
        //Our Stack at this point: "Customer" <- "order". Firstly we will get "order" field value, 
        //when it will be reached, on NameSpace+<>c__DisplayClass12_0 class instance
        //(type.GetField(fieldName)) then value of "Customer" property
        //(type.GetProperty(fieldName).GetValue(input)) on it. We started from 
        //order.Customer Expression then go up via reference to it's parent - "order", get it's value 
        //and then go back - get value of "Customer" property on order. Forward and backward
        //directions, at this case, reason to use Stack structure

        if(node.Expression.NodeType == ExpressionType.Constant 
           || 
           node.Expression.NodeType == ExpressionType.MemberAccess)
        {
            _fieldNames.Push(node.Member.Name);
            Visit(node.Expression);
        }                    
        else                                    
            //corresponds to: x.Customer - just write "Customer"
            _queryStringBuilder.Append(node.Member.Name);                 
        return node;
    }

    //corresponds to: 1, "Tom", instance of NameSpace+<>c__DisplayClass12_0, instance of Order, i.e.
    //any expression with value
    protected override Expression VisitConstant(ConstantExpression node)
    {
        //just write value
        _queryStringBuilder.Append(GetValue(node.Value));
        return node;
    }

    private string GetValue(object input)
    {
        var type = input.GetType();
        //if it is not simple value
        if (type.IsClass && type != typeof(string))
        {
            //proper order of selected names provided by means of Stack structure
            var fieldName = _fieldNames.Pop();
            var fieldInfo = type.GetField(fieldName);
            object value;
            if (fieldInfo != null)            
               //get instance of order    
                value = fieldInfo.GetValue(input);
            else
                //get value of "Customer" property on order
                value = type.GetProperty(fieldName).GetValue(input);
            return GetValue(value);
        }                    
        else
        {
            //our predefined _typeConverters
            if (_typeConverters.ContainsKey(type))
                return _typeConverters[type](input);
            else
            //rest types
                return input.ToString();
        }
    }

Conclusion

Our goal was to learn how to traversal expression tree with the intention to translate it to arbitrary language, in our case, Azure Table Storage conditions syntax. Direct approach is too complicated and error prone, but with the help of visitor pattern, we can make it easily. Fortunately, .NET Framework provides to us class ExpressionVisitor, which dedicated exactly to Expressions and has all needed functionality. We should just inherit from it and implement our custom logic.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here