Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#

Invent your own Dynamic LINQ parser

4.95/5 (28 votes)
14 Oct 2014MIT4 min read 82.6K   243  
This is an alternative for Dynamically generate a LINQ query with a custom property.

Introduction

This is an alternative to the original Tip Dynamically generate a LINQ query with a custom property.

The original post follows the concept of tweaking with the client of a query to allow dynamic queries. I find that quite intrusive and not re-usable for future use as I commented on the original post.

I outline here how you can re-invent your own Dynamic Linq version. I present some parsing techniques that might be helpful for other simple parser tasks. For "real" parser works, I strongly recommend using a parser generator like Coco/R or like ANTLR. See also an example of a Coco/R implementation of a simple math expression parser.

The functionality:

Create from an expression that is given in a string a function that can be used in a LINQ query, e.g.

C#
IEnumerable<MySourceElement> items = ...;
...
string s = GetUserEntry(); // e.g. Name == "x" || Number >= 800
...
var pred = SimpleExpression.PredicateParser<MySourceElement>.Parse(s);
var f = pred.Compile();
var query = from e in items where f(e) select e;
...

Using the code

This is a very dense code that shows how you could write your own Dynamic LINQ parser:

  • the scanner is from line 11 to line 54
  • the code generator is from line 57 to line 108
  • the parser from is line 109 to line 149

The implemented features:

  • names as properties or fields of the lambda parameter
  • double or int numbers
  • strings
  • nested expressions
  • numeric, string, boolean type system with numeric type promotion
  • operators ||, &&, ==, !=, <, <=, >=, >, !
C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Text.RegularExpressions;
 
namespace SimpleExpression
{
    public abstract class PredicateParser
    {
        #region scanner
        /// <summary>tokenizer pattern: Optional-SpaceS...Token...Optional-Spaces</summary>
        private static readonly string _pattern = @"\s*(" + string.Join("|", new string[]
        {
            // operators and punctuation that are longer than one char: longest first
            string.Join("|", new string[] { "||", "&&", "==", "!=", "<=", ">=" }.Select(e => Regex.Escape(e))),
            @"""(?:\\.|[^""])*""",  // string
            @"\d+(?:\.\d+)?",       // number with optional decimal part
            @"\w+",                 // word
            @"\S",                  // other 1-char tokens (or eat up one character in case of an error)
        }) + @")\s*";
        /// <summary>get 1st char of current token (or a Space if no 1st char is obtained)</summary>
        private char Ch { get { return string.IsNullOrEmpty(Curr) ? ' ' : Curr[0]; } }
        /// <summary>move one token ahead</summary><returns>true = moved ahead, false = end of stream</returns>
        private bool Move() { return _tokens.MoveNext(); }
        /// <summary>the token stream implemented as IEnumerator&lt;string&gt;</summary>
        private IEnumerator<string> _tokens;
        /// <summary>constructs the scanner for the given input string</summary>
        protected PredicateParser(string s)
        {
            _tokens = Regex.Matches(s, _pattern, RegexOptions.Compiled).Cast<Match>()
                      .Select(m => m.Groups[1].Value).GetEnumerator();
            Move();
        }
        protected bool IsNumber { get { return char.IsNumber(Ch); } }
        protected bool IsDouble { get { return IsNumber && Curr.Contains('.'); } }
        protected bool IsString { get { return Ch == '"'; } }
        protected bool IsIdent { get { char c = Ch; return char.IsLower(c) || char.IsUpper(c) || c == '_'; } }
        /// <summary>throw an argument exception</summary>
        protected void Abort(string msg) { throw new ArgumentException("Error: " + (msg ?? "unknown error")); }
        /// <summary>get the current item of the stream or an empty string after the end</summary>
        protected string Curr { get { return _tokens.Current ?? string.Empty; }}
        /// <summary>get current and move to the next token (error if at end of stream)</summary>
        protected string CurrAndNext { get { string s = Curr; if (!Move()) Abort("data expected"); return s; } }
        /// <summary>get current and move to the next token if available</summary>
        protected string CurrOptNext { get { string s = Curr; Move(); return s; } }
        /// <summary>moves forward if current token matches and returns that (next token must exist)</summary>
        protected string CurrOpAndNext(params string[] ops)
        {
            string s = ops.Contains(Curr) ? Curr : null;
            if (s != null && !Move()) Abort("data expected");
            return s;
        }
        #endregion
    }
    public class PredicateParser<TData>: PredicateParser
    {
        #region code generator
        private static readonly Type _bool = typeof(bool);
        private static readonly Type[] _prom = new Type[]
        { typeof(decimal), typeof(double), typeof(float), typeof(ulong), typeof(long), typeof(uint),
          typeof(int), typeof(ushort), typeof(char), typeof(short), typeof(byte), typeof(sbyte) };
        /// <summary>enforce the type on the expression (by a cast) if not already of that type</summary>
        private static Expression Coerce(Expression expr, Type type)
        {
            return expr.Type == type ? expr : Expression.Convert(expr, type);
        }
        /// <summary>casts if needed the expr to the "largest" type of both arguments</summary>
        private static Expression Coerce(Expression expr, Expression sibling)
        {
            if (expr.Type != sibling.Type)
            {
                Type maxType = MaxType(expr.Type, sibling.Type);
                if (maxType != expr.Type) expr = Expression.Convert(expr, maxType);
            }
            return expr;
        }
        /// <summary>returns the first if both are same, or the largest type of both (or the first)</summary>
        private static Type MaxType(Type a, Type b) { return a==b?a:(_prom.FirstOrDefault(t=>t==a||t==b)??a); }
        /// <summary>
        /// Code generation of binary and unary epressions, utilizing type coercion where needed
        /// </summary>
        private static readonly Dictionary<string, Func<Expression, Expression, Expression>> _binOp =
            new Dictionary<string,Func<Expression,Expression,Expression>>()
        {
            { "||", (a,b)=>Expression.OrElse(Coerce(a, _bool), Coerce(b, _bool)) },
            { "&&", (a,b)=>Expression.AndAlso(Coerce(a, _bool), Coerce(b, _bool)) },
            { "==", (a,b)=>Expression.Equal(Coerce(a,b), Coerce(b,a)) },
            { "!=", (a,b)=>Expression.NotEqual(Coerce(a,b), Coerce(b,a)) },
            { "<", (a,b)=>Expression.LessThan(Coerce(a,b), Coerce(b,a)) },
            { "<=", (a,b)=>Expression.LessThanOrEqual(Coerce(a,b), Coerce(b,a)) },
            { ">=", (a,b)=>Expression.GreaterThanOrEqual(Coerce(a,b), Coerce(b,a)) },
            { ">", (a,b)=>Expression.GreaterThan(Coerce(a,b), Coerce(b,a)) },
        };
        private static readonly Dictionary<string, Func<Expression, Expression>> _unOp =
            new Dictionary<string, Func<Expression, Expression>>()
        {
            { "!", a=>Expression.Not(Coerce(a, _bool)) },
        };
        /// <summary>create a constant of a value</summary>
        private static ConstantExpression Const(object v) { return Expression.Constant(v); }
        /// <summary>create lambda parameter field or property access</summary>
        private MemberExpression ParameterMember(string s) { return Expression.PropertyOrField(_param, s); }
        /// <summary>create lambda expression</summary>
        private Expression<Func<TData, bool>> Lambda(Expression expr) { return Expression.Lambda<Func<TData, bool>>(expr, _param); }
        /// <summary>the lambda's parameter (all names are members of this)</summary>
        private readonly ParameterExpression _param = Expression.Parameter(typeof(TData), "_p_");
        #endregion
        #region parser
        /// <summary>initialize the parser (and thus, the scanner)</summary>
        private PredicateParser(string s): base(s) { }
        /// <summary>main entry point</summary>
        public static Expression<Func<TData, bool>> Parse(string s) { return new PredicateParser<TData>(s).Parse(); }
        private Expression<Func<TData, bool>> Parse() { return Lambda(ParseExpression()); }
        private Expression ParseExpression() { return ParseOr(); }
        private Expression ParseOr()         { return ParseBinary(ParseAnd, "||"); }
        private Expression ParseAnd()        { return ParseBinary(ParseEquality, "&&"); }
        private Expression ParseEquality()   { return ParseBinary(ParseRelation, "==", "!="); }
        private Expression ParseRelation()   { return ParseBinary(ParseUnary, "<", "<=", ">=", ">"); }
        private Expression ParseUnary()      { return CurrOpAndNext("!") != null ? _unOp["!"](ParseUnary())
                                               : ParsePrimary(); }
        private Expression ParseIdent()      { return ParameterMember(CurrOptNext); }
        private Expression ParseString()     { return Const(Regex.Replace(CurrOptNext, "^\"(.*)\"$",
                                               m => m.Groups[1].Value)); }
        private Expression ParseNumber()     { if (IsDouble) return Const(double.Parse(CurrOptNext));
                                               return Const(int.Parse(CurrOptNext)); }
        private Expression ParsePrimary()
        {
            if (IsIdent) return ParseIdent();
            if (IsString) return ParseString();
            if (IsNumber) return ParseNumber();
            return ParseNested();
        }
        private Expression ParseNested()
        {
            if (CurrAndNext != "(") Abort("(...) expected");
            Expression expr = ParseExpression();
            if (CurrOptNext != ")") Abort("')' expected");
            return expr;
        }
        /// <summary>generic parsing of binary expressions</summary>
        private Expression ParseBinary(Func<Expression> parse, params string[] ops)
        {
            Expression expr = parse();
            string op;
            while ((op = CurrOpAndNext(ops)) != null) expr = _binOp[op](expr, parse());
            return expr;
        }
        #endregion
    }
}

This program calls the entry point of the parser above and runs various queries employing the calculated expression:

C#
static void Main(string[] args)
{
    var items = new List<Element>()
    {
        new Element("a", 1000),
        new Element("b", 900),
        new Element("c", 800),
        new Element("d", 700),
        new Element("e", 600),
        new Element("x", 500),
        new Element("y", 400),
        new Element("z", 300),
    };

    string s = "Name == \"x\" || Number >= 800";
    var pred = SimpleExpression.PredicateParser<Element>.Parse(s);
    Console.WriteLine("User Entry: {0}", s);
    Console.WriteLine("Expr Tree:  {0}", pred.ToString());
    var f = pred.Compile();
    Console.WriteLine("### mark affected items ###");
    foreach (var item in items)
    {
        Console.WriteLine("{2} Name = {0}, Number = {1}", item.Name, item.Number, f(item) ? "x" : " ");
    }
    Console.WriteLine("### where-select ###");
    var q = from e in items where f(e) select e;
    foreach (var item in q)
    {
        Console.WriteLine("  Name = {0}, Number = {1}", item.Name, item.Number);
    }
}

The output is:

User Entry: Name == "x" || Number >= 800
Expr Tree:  _p_ => ((_p_.Name == "x") OrElse (_p_.Number >= 800))
### mark affected items ###
x Name = a, Number = 1000
x Name = b, Number = 900
x Name = c, Number = 800
  Name = d, Number = 700
  Name = e, Number = 600
x Name = x, Number = 500
  Name = y, Number = 400
  Name = z, Number = 300
### where-select ###
  Name = a, Number = 1000
  Name = b, Number = 900
  Name = c, Number = 800
  Name = x, Number = 500 

Extending the Parser

Add more operations

To extend the parser: You may easily add more to the expressions, especially new operators is a simple thing:

  • to add + and - binary operators, add them to the _binOp dictionary (similar to ==, e.g. , ("+": Expression.Add(...), "-": Expression.Subtract(...)) create ParseSum() as a copy of ParseRelation, pass "+", "-" as ops, pass ParseSum to ParseRelation (in place of the ParseUnary), pass ParseUnary to ParseSum. That's it.
  • likewise for "*", "/", "%": make ParseMul as copy of the above mentioned ParseSum, pass the right ParseXXX actions, add the respective Expression factories to the _binOps dictionary. Done.
  • An unary "-" is to be added in the _unOps dictionary (no coercion needed). The parsing is done in the ParseUnary() function, e.g.

...

C#
    { ">", (a,b)=>Expression.GreaterThan(Coerce(a,b), Coerce(b,a)) },
    { "+", (a,b)=>Expression.Add(Coerce(a,b), Coerce(b,a)) },
    { "-", (a,b)=>Expression.Subtract(Coerce(a,b), Coerce(b,a)) },
    { "*", (a,b)=>Expression.Multiply(Coerce(a,b), Coerce(b,a)) },
    { "/", (a,b)=>Expression.Divide(Coerce(a,b), Coerce(b,a)) },
    { "%", (a,b)=>Expression.Modulo(Coerce(a,b), Coerce(b,a)) },
};
private static readonly Dictionary<string, Func<Expression, Expression>> _unOp =
               new Dictionary<string, Func<Expression, Expression>>()
{
    { "!", a=>Expression.Not(Coerce(a, _bool)) },
    { "-", a=>Expression.Negate(a) },
};

...

C#
private Expression ParseRelation() { return ParseBinary(ParseSum, "<", "<=", ">=", ">"); }
private Expression ParseSum()      { return ParseBinary(ParseMul, "+", "-"); }
private Expression ParseMul()      { return ParseBinary(ParseUnary, "*", "/", "%"); }
private Expression ParseUnary()
{
    if (CurrOpAndNext("!") != null) return _unOp["!"](ParseUnary());
    if (CurrOpAndNext("-") != null) return _unOp["-"](ParseUnary());
    return ParsePrimary();
}

Base Comparison on IComparable<T> instead of implicit operators

The original parser performs equality and comparison checks on direct operators. That a broader approach was to call a.CompareTo(b) rel 0, e.g. a.CompareTo(b) < 0 instead of a < b. The benefit is that you can use a wider range of types in the comparisons. To achieve that, add/replace the following code:

  1. Add before line 59:
    C#
    private static readonly Expression _zero = Expression.Constant(0);
  2. Add after line 79:
    C#
    /// <summary>produce comparison based on IComparable types</summary>
    private static Expression CompareToExpression(Expression lhs, Expression rhs, Func<Expression, Expression> rel)
    {
        lhs = Coerce(lhs, rhs);
        rhs = Coerce(rhs, lhs);
        Expression cmp = Expression.Call(
            lhs,
            lhs.Type.GetMethod("CompareTo", new Type[] { rhs.Type })
                ?? lhs.Type.GetMethod("CompareTo", new Type[] { typeof(object) }),
            rhs);
        return rel(cmp);
    }
    
  3. Replace lines 88-93 by the following:
    C#
    { "==", (a,b)=>CompareToExpression(a, b, c=>Expression.Equal             (c, _zero)) },
    { "!=", (a,b)=>CompareToExpression(a, b, c=>Expression.NotEqual          (c, _zero)) },
    { "<",  (a,b)=>CompareToExpression(a, b, c=>Expression.LessThan          (c, _zero)) },
    { "<=", (a,b)=>CompareToExpression(a, b, c=>Expression.LessThanOrEqual   (c, _zero)) },
    { ">=", (a,b)=>CompareToExpression(a, b, c=>Expression.GreaterThanOrEqual(c, _zero)) },
    { ">",  (a,b)=>CompareToExpression(a, b, c=>Expression.GreaterThan       (c, _zero)) },

Add support for nested identifiers

As metbone suggested, we can extend the parser to also support nested identifiers in the form a.b.c. This is best done by replacing the ParseIdent() method in line 122 by the following code:

C#
/// parsing single or nested identifiers. EBNF: ParseIdent = ident { "." ident } .
private Expression ParseIdent()
{
    Expression expr = ParameterMember(CurrOptNext);
    while (CurrOpAndNext(".") != null && IsIdent) expr = Expression.PropertyOrField(expr, CurrOptNext);
    return expr;
}

Points of Interest

As mentioned, check out the available parser generators to do real work with parsers.

LINQ Expression Tree

Dynamic LINQ

Any feedback is very much appreciated.

Please note: this is meant as alternative to the given tip. This alternative does not require the data provider to use special kind of properties or fields. For real work on this, checkout the available Dynamic LINQ implementaiton.

Have fun!

Andi

History

V1.02012-03-28
First version
V1.12012-03-29
fix typo on line 107, simplified ParsePrimary, fixed ParseNumber to correctly distinguish between int and double, some minor code formatting adjustments, add sources download link, update tags (.NET4)
V1.22012-03-29
added sample code for showing how easily extending the expression parser
early adopter of the brand new code rendering feature:
HTML
<pre ... countlines="true" countstart="93">...
(thanks Chris Maunder[^] for adding this feature[^] so quickly!)
V1.32012-04-06
Fix Regex for string (properly handles embedded double quotes) and number (decimal part is optional)
V1.42012-09-04
Fix some minor typos
V1.52013-05-20
Added description on how to extend the parser to base comparison on IComparable<T> instead of direct relation and equality operators
V1.62014-03-04
Added a link to simple math expression parser, fixed some typos, fix broken highlighting, changed type
V1.72014-10-14
add support for nested identifiers (no function call support nor index access implemented yet, though)

License

This article, along with any associated source code and files, is licensed under The MIT License