Translating C# Lambda Expressions to General Purpose Filter Descriptors & HTTP Query Strings

22 May 2020 | CPOL | 11 min read
An idea on how to use .NET ExpressionVisitor to translate lambda expressions into classes that encapsulate data suitable for filtering data & creating query strings
C# lambda expressions provide a convenient and concise way of describing a condition for filtering almost any type of data in a C# program. However, converting a given lambda expression to forms used in other domains, such as an HTTP request, can be tedious and tricky. This article proposes an approach to this problem based on the ExpressionVisitor class of the .NET Framework.

Introduction

It is relatively common to build a mechanism for converting business conditions expressed in code to a format usable by other tiers of a solution, be it a database or a Web service, especially in the infrastructure layers of a solution. The following two common scenarios are examples of such a case:

  1. Suppose we want to pass filter conditions from a C# client to an HTTP service. These conditions can be sent as query string parameters, but building query strings manually through string concatenation is neither clean nor easy to debug and maintain.
  2. At times we might want to translate filter conditions into SQL WHERE clauses without using an ORM tool. Again, constructing SQL WHERE clauses through manual string manipulation is error-prone and hard to maintain.

Lambda expressions give a concise and convenient means of describing filter conditions, but working with these expressions directly is not easy. Luckily, the ExpressionVisitor class in the System.Linq.Expressions namespace is an excellent tool for inspecting, modifying and translating lambda expressions.

In this article, we mainly use the ExpressionVisitor class to propose a solution to the first scenario above.

Background

Before diving into the details, let us have a very brief introduction to the general concept of expressions, then the condition expressions as a more special type, and finally a very short description of ExpressionVisitor class. It will be very short but absolutely necessary, so please skip this section only if you know these subjects beforehand.

What Are Expressions in General and How Are Condition Expressions Different From Them?

In general, an expression represents a delegate or method. An expression is not itself a delegate or method; it describes the structure of one. On the .NET platform, we use the Expression class to define an expression. Before the body of the delegate can be defined, however, the signature of the delegate must be specified. This signature is given to the Expression class via a generic type parameter named TDelegate, so the expression class has the form Expression<TDelegate>.

With this in mind, a condition expression represents a delegate that takes an object of an arbitrary type T as its input and returns a Boolean value. The delegate of a condition expression is therefore of type Func<T, bool>, and the condition expression itself is of type Expression<Func<T, bool>>.
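The difference between a delegate and the expression that describes it can be seen in a few lines. This is only a minimal sketch: the Item class and the ConditionExpressionDemo name are hypothetical and exist purely for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class ConditionExpressionDemo
{
    // Hypothetical sample type used only for this illustration.
    public class Item
    {
        public decimal Price { get; set; }
    }

    public static void Main()
    {
        // A delegate: compiled code that can only be invoked.
        Func<Item, bool> asDelegate = p => p.Price < 1000;

        // An expression: a data structure describing the same condition.
        Expression<Func<Item, bool>> asExpression = p => p.Price < 1000;

        var item = new Item { Price = 500 };

        Console.WriteLine(asDelegate(item));                // True
        Console.WriteLine(asExpression.Body.NodeType);      // LessThan
        Console.WriteLine(asExpression.Parameters[0].Name); // p

        // An expression can be turned into a delegate on demand.
        Func<Item, bool> compiled = asExpression.Compile();
        Console.WriteLine(compiled(item));                  // True
    }
}
```

The same lambda text produces either form; only the declared type of the variable decides whether the compiler emits executable code or an expression tree.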

How ExpressionVisitor Works

We usually use a lambda expression to define an expression. A lambda expression consists of multiple different expressions combined together. Consider this example lambda:

C#
p => p.Price < 1000 && p.Name.StartsWith("a-string") && !p.OutOfStock

The figure below marks its different parts:

Image 1

As you can see, this expression is a combination of some other expressions and operators.

Now let us see how ExpressionVisitor treats an expression like the one above. This class implements the visitor pattern. Its main method, or entry point, is called Visit and acts as a dispatcher that calls several other specialized methods. When an expression is passed to the Visit method, the expression tree is traversed and, depending on the type of each node, a specialized method is called to visit (inspect and modify) that node and its children, if any. Each method returns a modified copy of the expression if it changed anything, and the original expression otherwise. Keep in mind that expressions are immutable, so any modification results in a new instance being built and returned.
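The dispatching and copy-on-modify behavior can be observed with a very small visitor. The sketch below is not part of the article's solution; DoublingVisitor and VisitorDemo are hypothetical names, and the visitor simply doubles every int constant it meets.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical visitor: doubles every int constant it meets.
public class DoublingVisitor : ExpressionVisitor
{
    protected override Expression VisitConstant(ConstantExpression node)
    {
        if (node.Value is int n)
            return Expression.Constant(n * 2); // a new node; the old one is untouched

        return base.VisitConstant(node);
    }
}

public static class VisitorDemo
{
    public static void Main()
    {
        Expression<Func<int, bool>> original = x => x < 1000;

        // Visit dispatches to VisitLambda, VisitBinary, VisitParameter and VisitConstant.
        var rewritten = new DoublingVisitor().Visit(original);

        Console.WriteLine(original);   // x => (x < 1000)
        Console.WriteLine(rewritten);  // x => (x < 2000)
        Console.WriteLine(ReferenceEquals(original, rewritten)); // False
    }
}
```

Only VisitConstant is overridden, yet the binary node and the lambda around it are rebuilt as well, because their child changed. The original tree stays as it was.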

Microsoft's online documentation for .NET Framework 4.8 lists 35 specialized visit methods. The ones used in our solution include VisitBinary, VisitConstant, VisitMember and VisitNew.

All of these visit methods are virtual, and a class inheriting from ExpressionVisitor overrides the ones it needs and implements its own logic. This is how a custom visitor is built.

For those readers who might be willing to obtain a good understanding of how our solution works, having at least a minimum familiarity with the following subjects is necessary.

  • Expression Trees (1) & (2)
    • A general concept behind the lambda expressions that we want to translate
  • Tree Traversals (Inorder, Preorder and Postorder)
    • Algorithms used to iterate a tree
  • Visitor Design Pattern
    • A design pattern that is used to parse expression trees
  • ExpressionVisitor Class
    • A class provided by the Microsoft .NET platform that uses the visitor design pattern to expose methods for inspecting, modifying and translating expression trees. We will use these methods to inspect each node of interest in the tree and extract the needed data from it.
  • Reverse Polish Notation (RPN)
    • In reverse Polish notation, the operators follow their operands; for instance, to add 3 and 4, one would write "3 4 +" rather than "3 + 4".
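Several of these subjects meet in one place: visiting an expression tree in post-order yields its nodes in reverse Polish notation. The following is a rough sketch of that idea only, not the article's FilterBuilder; RpnVisitor and RpnDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical visitor: records node labels in post-order (RPN).
public class RpnVisitor : ExpressionVisitor
{
    public List<string> Output { get; } = new List<string>();

    protected override Expression VisitBinary(BinaryExpression node)
    {
        Visit(node.Left);                      // operands first...
        Visit(node.Right);
        Output.Add(node.NodeType.ToString());  // ...then the operator
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        Output.Add(Convert.ToString(node.Value));
        return node;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        Output.Add(node.Name);
        return node;
    }
}

public static class RpnDemo
{
    public static string ToRpn(Expression expression)
    {
        var visitor = new RpnVisitor();
        visitor.Visit(expression);
        return string.Join(" ", visitor.Output);
    }

    public static void Main()
    {
        Expression<Func<int, int, bool>> exp = (a, b) => a + 3 < b * 4;

        // Only the body is visited, so the lambda's parameter list is not recorded.
        Console.WriteLine(ToRpn(exp.Body)); // a 3 Add b 4 Multiply LessThan
    }
}
```

Because every operator appears after its operands, a consumer can rebuild the meaning of the expression with nothing more than a stack.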

The Big Picture

As the figure below shows, we have a FilterBuilder class that takes an expression of type Expression<Func<T, bool>> as its input. This class is the main part of the solution. In the first step, the FilterBuilder examines the input expression and outputs a collection of FilterDescriptors (IEnumerable<FilterDescriptor>). In the next step, a converter converts this collection of FilterDescriptors to the desired form, e.g., query string key-value pairs to be used in an HTTP request, or a string to be used as a SQL WHERE clause. A separate converter is needed for every type of conversion.

Image 2

One question may arise here: why not convert the input expression directly to a query string? Is it necessary to take on the burden of generating FilterDescriptors? Can this extra step be skipped? If all you need is to generate query strings, and you are not looking for a general solution, you are free to do so. However, you will end up with a very specific ExpressionVisitor that is suitable for only one type of output. For that goal, a good article is written here. What this article tries to do is exactly the opposite: propose a more general solution.

The Solution

Foundations

At the heart of our solution is the FilterBuilder class, which inherits from ExpressionVisitor. The constructor of this class takes an expression of type Expression<Func<T, bool>>. The class has a public method named Build that returns a collection of FilterDescriptor objects. FilterDescriptor is defined as follows:

C#
public class FilterDescriptor
{
    public FilterDescriptor()
    {
        CompositionOperator = FilterOperator.And;
    }

    private FilterOperator _compositionOperator;

    public FilterOperator CompositionOperator
    {
        get => _compositionOperator;
        set
        {
            if (value != FilterOperator.And && value != FilterOperator.Or)
                throw new ArgumentOutOfRangeException();

            _compositionOperator = value;
        }
    }

    public string FieldName { get; set; }
    public object Value { get; set; }
    public FilterOperator Operator { get; set; }     

    // For demo purposes
    public override string ToString()
    {
        return
            $"{CompositionOperator} {FieldName ?? "FieldName"} {Operator} {Value ?? "Value"}";

    }
}

The Operator property of this class is of type FilterOperator, an enumeration. This property specifies the operator of the filter.

C#
public enum FilterOperator
{
    NOT_SET,

    // Logical
    And,
    Or,
    Not,

    // Comparison
    Equal,
    NotEqual,
    LessThan,
    LessThanOrEqual,
    GreaterThan,
    GreaterThanOrEqual,

    // String
    StartsWith,
    Contains,
    EndsWith,
    NotStartsWith,
    NotContains,
    NotEndsWith
}

Expression nodes are not directly converted to FilterDescriptor objects. Instead, each overridden method that visits an expression node creates an object called a token and adds it to a private list. Tokens in this list are arranged according to Reverse Polish Notation (RPN). What is a token? A token encapsulates the node data required to build FilterDescriptors. Tokens are defined by classes that inherit from an abstract Token class.

C#
public abstract class Token {}

public class BinaryOperatorToken : Token
{
    public FilterOperator Operator { get; set; }

    public BinaryOperatorToken(FilterOperator op)
    {
        Operator = op;
    }

    public override string ToString()
    {
        return "Binary operator token:\t" + Operator.ToString();
    }
}

public class ConstantToken : Token
{
    public object Value { get; set; }

    public ConstantToken(object value)
    {
        Value = value;
    }

    public override string ToString()
    {
        // Value may be null (e.g. a comparison with null), so do not call ToString() on it directly
        return "Constant token:\t\t" + Value;
    }
}

public class MemberToken : Token
{
    public Type Type { get; set; }

    public string MemberName { get; set; }

    public MemberToken(string memberName, Type type)
    {
        MemberName = memberName;
        Type = type;
    }

    public override string ToString()
    {
        return "Member token:\t\t" + MemberName;
    }
}

public class MethodCallToken : Token
{
    public string MethodName { get; set; }

    public MethodCallToken(string methodName)
    {
        MethodName = methodName;
    }

    public override string ToString()
    {
        return "Method call token:\t" + MethodName;
    }
}

public class ParameterToken : Token
{
    public string ParameterName { get; set; }
    public Type Type { get; set; }

    public ParameterToken(string name, Type type)
    {
        ParameterName = name;
        Type = type;
    }

    public override string ToString()
    {
        return "Parameter token:\t\t" + ParameterName;
    }
}

public class UnaryOperatorToken : Token
{
    public FilterOperator Operator { get; set; }

    public UnaryOperatorToken(FilterOperator op)
    {
        Operator = op;
    }

    public override string ToString()
    {
        return "Unary operator token:\t\t" + Operator.ToString();
    }
}

After all nodes of the expression are traversed and their equivalent tokens are created, FilterDescriptors can be created. This will be done by calling a method named Build.

As stated in the "How ExpressionVisitor Works" section, every part of the expression comprises multiple subexpressions. For example, p.Price < 1000 is a binary expression made up of three parts:

  1. p.Price (member expression)
  2. < ("less than" binary operator)
  3. 1000 (constant expression)

When visited, this 3-part binary expression will produce three different tokens:

  1. A MemberToken for p.Price by VisitMember method
  2. A BinaryOperatorToken for < by VisitBinary method
  3. A ConstantToken for 1000 by VisitConstant method

When the Build method is called, it first creates a Stack<FilterDescriptor> object. It then iterates over the token list and, based on the type of the current token, pushes descriptors onto the stack and pops them from it. This way, different tokens, like the three in the above example, are combined to build a single FilterDescriptor.

C#
public IEnumerable<FilterDescriptor> Build()
{
    var filters = new Stack<FilterDescriptor>();

    for (var i = 0; i < _tokens.Count; i++)
    {
        var token = _tokens[i];

        switch (token)
        {
            case ParameterToken p:
                var f = getFilter();
                f.FieldName = p.ParameterName;
                filters.Push(f);
                break;

            case BinaryOperatorToken b:
                var f1 = getFilter();

                switch (b.Operator)
                {
                    case FilterOperator.And:
                    case FilterOperator.Or:
                        var ff = filters.Pop();
                        ff.CompositionOperator = b.Operator;
                        filters.Push(ff);
                        break;

                    case FilterOperator.Equal:
                    case FilterOperator.NotEqual:
                    case FilterOperator.LessThan:
                    case FilterOperator.LessThanOrEqual:
                    case FilterOperator.GreaterThan:
                    case FilterOperator.GreaterThanOrEqual:
                        f1.Operator = b.Operator;
                        filters.Push(f1);
                        break;
                }

                break;

            case ConstantToken c:
                var f2 = getFilter();
                f2.Value = c.Value;
                filters.Push(f2);
                break;

            case MemberToken m:
                var f3 = getFilter();
                f3.FieldName = m.MemberName;
                filters.Push(f3);
                break;

            case UnaryOperatorToken u:
                var f4 = getFilter();
                f4.Operator = u.Operator;
                f4.Value = true;
                filters.Push(f4);
                break;

            case MethodCallToken mc:
                var f5 = getFilter();
                f5.Operator = _methodCallMap[mc.MethodName];
                filters.Push(f5);
                break;
        }
    }

    var output = new Stack<FilterDescriptor>();

    while (filters.Any())
    {
        output.Push(filters.Pop());
    }

    return output;

    // Returns the descriptor to fill next: the one on top of the stack if it is
    // still missing a part, otherwise a fresh one.
    FilterDescriptor getFilter()
    {
        if (filters.Any())
        {
            var top = filters.Peek();

            var incomplete = top.Operator == default ||
                             top.CompositionOperator == default ||
                             top.FieldName == default ||
                             top.Value == default;

            if (incomplete)
                return filters.Pop();
        }

        return new FilterDescriptor();
    }
}

When the Build method returns, all descriptors are ready to be converted to whatever form that is needed.

Necessary Expression Modifications

Three modifications to the original expression are introduced here, which help a lot in simplifying things. These three changes are my own solution to make the code simpler and more practical. They are not theoretically necessary and one can further develop this example to solve the problem another way and keep the original expression intact.

Modifying Boolean MemberAccess Expressions

Every condition is defined by three things: a parameter, a value, and an operator that relates the parameter to that value. Now consider the expression p.OutOfStock, where OutOfStock is a Boolean property of the object p. At first glance, it lacks two of the three parts: an operator and a Boolean value. In fact, it is a short form of the expression p.OutOfStock == true. The algorithm in this article, however, expects all three parts in order to function as expected. In my experience, trying to use this kind of expression as it is, without the operator and Boolean value explicitly stated, tends to add unnecessary complexity to the solution. For this reason, we visit the expression in two passes. For the first pass, a separate class named BooleanVisitor, which also inherits from ExpressionVisitor, is used. It overrides only the VisitMember method. This class is privately nested in the FilterBuilder.

C#
private class BooleanVisitor : ExpressionVisitor
{
    protected override Expression VisitMember(MemberExpression node)
    {
        if (node.Type == typeof(bool))
        {
            return Expression.MakeBinary
                   (ExpressionType.Equal, node, Expression.Constant(true));
        }

        return base.VisitMember(node);
    }
}

This overridden method adds the two missing parts to a Boolean member access expression and returns the modified copy. The second pass is performed afterwards, on the modified expression. Both passes are run in the constructor of the FilterBuilder.

C#
// ctor of the FilterBuilder

public FilterBuilder(Expression expression)
{
    var fixer = new BooleanVisitor();
    var fixedExpression = fixer.Visit(expression);
    base.Visit(fixedExpression);
}
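To see the effect of the first pass in isolation, the visitor can be run on its own and the result printed. The sketch below repeats the BooleanVisitor logic in a public class so that it compiles standalone; BooleanVisitorCopy, BooleanVisitorDemo and Item are hypothetical names introduced only for this demonstration.

```csharp
using System;
using System.Linq.Expressions;

// Standalone copy of the BooleanVisitor logic, made public for the demo.
public class BooleanVisitorCopy : ExpressionVisitor
{
    protected override Expression VisitMember(MemberExpression node)
    {
        if (node.Type == typeof(bool))
        {
            return Expression.MakeBinary(
                ExpressionType.Equal, node, Expression.Constant(true));
        }

        return base.VisitMember(node);
    }
}

public static class BooleanVisitorDemo
{
    // Hypothetical sample type used only for this illustration.
    public class Item
    {
        public bool OutOfStock { get; set; }
    }

    public static Expression Fix(Expression expression) =>
        new BooleanVisitorCopy().Visit(expression);

    public static void Main()
    {
        Expression<Func<Item, bool>> exp = p => p.OutOfStock;

        Console.WriteLine(exp);       // p => p.OutOfStock
        Console.WriteLine(Fix(exp));  // p => (p.OutOfStock == True)
    }
}
```

Note that the rewrite applies to every Boolean member access, so a condition that already reads p.OutOfStock == true is wrapped once more. A more defensive visitor might check the parent node before rewriting.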

Modifying Negated Comparison Operators

Sometimes the relation of a variable to a value in a condition contains a comparison operator combined with a negation operator. An example is !(p.Price > 30000). In such cases, replacing this combination with a single equivalent operator makes things simpler. For example, instead of the ! (not) and > (greater than) operators combined, a single <= (less than or equal) operator can be used. The same applies to string comparison operators: any combination of the negation operator and a string comparison operator is replaced by a single equivalent operator defined in the FilterOperator enumeration, such as NotContains.
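One possible way to fold a negated comparison into a single operator is sketched below. It works directly on the expression tree and is an assumption about how such a rewrite could look, not the code from the attached source; NegationFoldingVisitor and NegationDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical visitor: folds "!(a op b)" into the opposite comparison.
public class NegationFoldingVisitor : ExpressionVisitor
{
    private static readonly Dictionary<ExpressionType, ExpressionType> Opposite =
        new Dictionary<ExpressionType, ExpressionType>
        {
            { ExpressionType.Equal, ExpressionType.NotEqual },
            { ExpressionType.NotEqual, ExpressionType.Equal },
            { ExpressionType.LessThan, ExpressionType.GreaterThanOrEqual },
            { ExpressionType.LessThanOrEqual, ExpressionType.GreaterThan },
            { ExpressionType.GreaterThan, ExpressionType.LessThanOrEqual },
            { ExpressionType.GreaterThanOrEqual, ExpressionType.LessThan }
        };

    protected override Expression VisitUnary(UnaryExpression node)
    {
        // Only a "Not" applied directly to a known comparison is rewritten.
        if (node.NodeType == ExpressionType.Not &&
            node.Operand is BinaryExpression inner &&
            Opposite.TryGetValue(inner.NodeType, out var flipped))
        {
            return Expression.MakeBinary(flipped, Visit(inner.Left), Visit(inner.Right));
        }

        return base.VisitUnary(node);
    }
}

public static class NegationDemo
{
    public static string Fold(Expression expression) =>
        new NegationFoldingVisitor().Visit(expression).ToString();

    public static void Main()
    {
        Expression<Func<int, bool>> exp = price => !(price > 30000);
        Console.WriteLine(Fold(exp)); // price => (price <= 30000)
    }
}
```

This kind of folding is safe for ordinary integers, decimals and dates. For floating-point values that may be NaN, or for nullable operands, a negated comparison and its "opposite" are not always equivalent, so a production version would need to exclude those cases.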

Modifying DateTime Values

Two important things should be noted here. First, DateTime values need special attention while visiting an expression tree because a DateTime value can appear in many forms in an expression. Some of those forms that are covered in this solution are:

  1. A simple MemberAccess expression: DateTime.Now or DateTime.Today
  2. Nested MemberAccess expressions: DateTime.Now.Date
  3. A NewExpression: new DateTime(1989, 3, 25)
  4. A NewExpression followed by a MemberAccess expression: new DateTime(1989, 3, 25).Date

When a DateTime value appears as a MemberAccess expression, it should be handled in the VisitMember method. When it appears as a NewExpression, it should be handled in the VisitNew method.

Second, one can transfer a DateTime value over the wire in many forms. For instance, it can be converted to a string and formatted arbitrarily; or it can be converted to a long integer (Ticks) and sent as a number. Choosing a specific data type and format is a matter of business requirement or technical constraint. Anyway, here, the Ticks property of the DateTime structure is chosen because of simplicity and also because it can be platform independent.

For these two reasons, our expression visitor replaces instances of the DateTime structure with their Ticks equivalent. This means that we have to obtain the value of the Ticks property of DateTime values when the expression visitor code runs. Thus, the expression containing the DateTime value should be compiled to a method and run, as in the code below:

C#
protected override Expression VisitMember(MemberExpression node)
{
    if (node.Type == typeof(DateTime))
    {
        if (node.Expression == null) // Simple MemberAccess like DateTime.Now
        {
            var lambda = Expression.Lambda<Func<DateTime>>(node);
            var dateTime = lambda.Compile()();
            base.Visit(Expression.Constant(dateTime.Ticks));
            return node;
        }
        else
        {
            switch (node.Expression.NodeType)
            {
                case ExpressionType.New:
                    // Evaluate the whole member access (e.g. new DateTime(...).Date),
                    // not only the inner "new" expression
                    var lambda = Expression.Lambda<Func<DateTime>>(node);
                    var dateTime = lambda.Compile()();
                    base.Visit(Expression.Constant(dateTime.Ticks));
                    return node;

                case ExpressionType.MemberAccess: // Nested MemberAccess            
                    if (node.Member.Name != ((MemberExpression)node.Expression).Member.Name) 
                    {
                        var lambda2 = Expression.Lambda<Func<DateTime>>(node);
                        var dateTime2 = lambda2.Compile()();
                        base.Visit(Expression.Constant(dateTime2.Ticks));
                        return node;
                    }
                    break;
            }
        }
    }

    _tokens.Add(new MemberToken(node.Expression + "." + node.Member.Name, node.Type));
    return node;
}

protected override Expression VisitNew(NewExpression node)
{
    if (node.Type == typeof(DateTime))
    {
        var lambda = Expression.Lambda<Func<DateTime>>(node);
        var dateTime = lambda.Compile()();
        base.Visit(Expression.Constant(dateTime.Ticks));
        return node;
    }

    return base.VisitNew(node);
}
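The compile-and-run step used above can be tried on its own. The following sketch, with the hypothetical names DateTimeTicksDemo and EvaluateTicks, wraps a parameterless DateTime expression in a lambda, runs it, and shows that the resulting Ticks value can be turned back into the same DateTime on the receiving side.

```csharp
using System;
using System.Linq.Expressions;

public static class DateTimeTicksDemo
{
    // Evaluates a parameterless DateTime expression and returns its Ticks.
    public static long EvaluateTicks(Expression dateTimeExpression)
    {
        var lambda = Expression.Lambda<Func<DateTime>>(dateTimeExpression);
        return lambda.Compile()().Ticks;
    }

    public static void Main()
    {
        // A NewExpression followed by a MemberAccess expression (.Date).
        Expression<Func<DateTime>> exp =
            () => new DateTime(1989, 3, 25, 14, 30, 0).Date;

        long ticks = EvaluateTicks(exp.Body);

        // The receiver can rebuild the value from the number alone.
        var restored = new DateTime(ticks);
        Console.WriteLine(restored == new DateTime(1989, 3, 25)); // True
    }
}
```

Ticks carries no time zone or DateTimeKind information, so both sides need to agree on whether values are local or UTC. Compiling a lambda for each value also has a cost, which is acceptable for occasional filters but worth keeping in mind for hot paths.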

Converting FilterDescriptors

As mentioned earlier, when the Build method returns, a collection of FilterDescriptors is ready to be fed to any class or method to be converted to any desired form. In case of a query string, this method can simply be an extension method or a separate class depending on the programmer's preference. Note that every server program will be expecting a predefined set of key-value pairs. For example, suppose there is a server that will look for different parameters of filters in separate array-like key-value pairs. The following extension method will do the job.

C#
public static class FilterBuilderExtensions
{
    public static string GetQueryString(this IList<FilterDescriptor> filters)
    {
        var sb = new StringBuilder();

        for (var i = 0; i < filters.Count; i++)
        {
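            // Escape names and values so that characters such as '&', '=' or spaces
            // cannot corrupt the query string
            sb.Append(
                $"filterField[{i}]={Uri.EscapeDataString(filters[i].FieldName ?? string.Empty)}&" +
                $"filterOp[{i}]={filters[i].Operator}&" +
                $"filterVal[{i}]={Uri.EscapeDataString(filters[i].Value?.ToString() ?? string.Empty)}&" +
                $"filterComp[{i}]={filters[i].CompositionOperator}");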

            if (i < filters.Count - 1)
                sb.Append("&");
        }

        return sb.ToString();
    }
}

Example Usage

This simple console program demonstrates how the FilterBuilder can be used.

The ToString method of the FilterDescriptor and all token classes is overridden so that their properties can be inspected in the console.

C#
class Program
{
    static void Main(string[] args)
    {
        Expression<Func<Product, bool>> exp = p =>
            p.Id == 1009 &&
            !p.OutOfStock &&
            !(p.Price > 30000) &&
            !p.Name.Contains("BMW") &&
            p.ProductionDate > new DateTime(1999, 6, 20).Date;

        var visitor = new FilterBuilder(exp);
        var filters = visitor.Build().ToList();

        Console.WriteLine("Tokens");
        Console.WriteLine("------\n");

        foreach (var t in visitor.Tokens)
        {
            Console.WriteLine(t);
        }

        Console.WriteLine("\nFilter Descriptors");
        Console.WriteLine("------------------\n");

        foreach (var f in filters)
        {
            Console.WriteLine(f);
        }

        Console.WriteLine($"\nQuery string");
        Console.WriteLine("------------\n");
        Console.WriteLine(filters.GetQueryString());

        Console.ReadLine();
    }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
    public DateTime ProductionDate { get; set; }
    public bool OutOfStock { get; set; } = false;
}

The output:

Image 3

Omitted Features

There surely are many potential improvements that would make this solution more robust, but they are intentionally omitted from this article for the sake of brevity. One necessary feature would be support for parentheses in the expression through a new class that wraps a collection of FilterDescriptors. Such features require more time and effort and might be covered at a later time. However, I hope readers are able to grasp the core concepts presented here and develop a better solution on top of this work.

The full source code of the solution is available in the ZIP file attached to this article.

History

  • 16th March, 2020: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)