Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#

A Dynamic Where Implementation for Entity Framework

4.73/5 (4 votes)
2 Apr 2023CPOL6 min read 26.7K   228  
Using EntityFramework Core and SQL Server to create a dynamic IQueryable filtering that is not susceptible to SQL injection attacks
Using EntityFramework Core and SQL Server to create a dynamic IQueryable filtering that is not susceptible to SQL injection attacks and resolves in the SQL statement rather than applied after data is returned.

Introduction

I have a requirement to filter SQL queries dynamically -- meaning I don't know the columns, comparison operator, or comparison value until runtime. After much searching, I've put together this article of which 90% of what I present here is based on ideas and code from other people's work. The "added value" in this article is that it puts all the disparate pieces together into one solution to the problem.

My first stop was this excellent article by CPian Fitim Skenderi, Build Where Clause Dynamically in Linq, (also see Appendix 2: Caveat on Skenderi's Work) which got my feet wet as to how to build dynamic queries with Linq's Expression class. The drawback with Skenderi's code is that it doesn't work with IQueryable. It filters the result of the query rather than injecting the SQL with the filter and leveraging the database's SQL engine, not to mention that the point of the filter is to reduce the set of records being returned by the database.

My second stop was a Stack Overflow post by Axel Heer. This was in some ways a cleaner implementation and worked with IQueryable, but it had two problems. The first was that the value to compare was coded as an Expression.Constant. While the filter was now in the SQL where clause, the literal strings made the SQL vulnerable to SQL injection attacks. What I needed was for the values to be added as SQL parameters.

My third stop solved that problem by a Stack Overflow post by Ivan Stoev. This involved creating an expression of a Func that returned the value and applying the expression body. Since the value is no longer a constant, it turns out that Entity Framework converts the expression body of a Func into a parameter! Voodoo magic.

However, there was now a new problem to solve. As a constant, SQL nicely handled nullable vs. non-nullable types as well as automatic data conversion. For example, I could do date comparisons on a constant like '1962-08-19' but once this value was turned into a SQL parameter, the data type of the backing model had to exactly match the data type of the table column. Solving this problem is what I've added to the work done by Skenderi, Heer, and Stoev.

The Test Schema and Models

The schema I'm using has two tables:

  • InventoryItem (Id and Name)
  • Sale (Id, InventoryItem_ID, and SaleDate)

Please see the Appendix 1: Schema for the schema.

Please note

  1. Each inventory item has zero or more sales, and what I want back is a flattened view of each inventory item and the list of dates when that item was sold.
  2. To keep things simple and unrealistic, the inventory item and sale is a one-to-many relationship; obviously, a sale would usually apply to one or more inventory items for that sale and therefore would be a many-to-many relationship, but there's no reason to create a third mapping table for the purposes of this article.
  3. While the Sale model has a foreign key to the InventoryItem, again for simplicity, I'm not implementing a collection of Sale in the InventoryItem model -- that's not the point of this article.

InventoryItem Model

C#
[Table("InventoryItems")]
public class InventoryItem
{
  [Key, DatabaseGenerated(DatabaseGeneratedOption.Identity)]
  public int Id { get; set; }
  public string Name { get; set; }
}

Sale Model

C#
[Table("Sales")]
public class Sale
{
  public int Id { get; set; }
  public int InventoryItem_Id { get; set; }
  public DateTime SaleDate { get; set; }
}

The DbContext

This is self-explanatory:

C#
public class MyContext : DbContext
{
  public DbSet<InventoryItem> InventoryItems { get; set; }
  public DbSet<Sale> Sales { get; set; }

  public MyContext(DbContextOptions<DbContext> options) : base(options)
  {
  }
}

Test Data Initialization

The test data is initialized and cleaned up as part of the unit test. In the following code, the records are deleted and two inventory items are created. Three sales are associated with the first item and two with the second inventory item:

C#
[TestClass]
public class DynamicWhereTests
{
  // Modify this for your local SQL Server instance.
  public static string connectionString = "Data Source=[your data source];
  Initial Catalog=DynamicWhere;Integrated Security=True;
  Connection Timeout=60;TrustServerCertificate=True";

  public static InventoryItem inv1;
  public static InventoryItem inv2;

  [ClassInitialize]
  public static void CreateRecords(TestContext _)
  {
    var builder = new DbContextOptionsBuilder<DbContext>().UseSqlServer(connectionString);
    using var context = new MyContext(builder.Options);
  
    context.Sales.ExecuteDelete();
    context.InventoryItems.ExecuteDelete();

    inv1 = new InventoryItem() { Name = "Item 1" };
    inv2 = new InventoryItem() { Name = "Item 2" };
    context.InventoryItems.Add(inv1);
    context.InventoryItems.Add(inv2);
    context.SaveChanges();
    context.Sales.Add(new Sale() 
    { InventoryItem_Id = inv1.Id, SaleDate = DateTime.Parse("08/19/1962") });
    context.Sales.Add(new Sale() 
    { InventoryItem_Id = inv1.Id, SaleDate = DateTime.Parse("08/20/1962") });
    context.Sales.Add(new Sale() 
    { InventoryItem_Id = inv1.Id, SaleDate = DateTime.Parse("08/21/1962") });
    context.Sales.Add(new Sale() 
    { InventoryItem_Id = inv2.Id, SaleDate = DateTime.Parse("03/21/1991") });
    context.Sales.Add(new Sale() 
    { InventoryItem_Id = inv2.Id, SaleDate = DateTime.Parse("03/22/1991") });
    context.SaveChanges();
  }
  ...

Instead of using a ClassCleanup method, the records are deleted as part of the initialization because debugging the tests and exiting the debugger doesn't run the cleanup method.

A Basic Query

We'll serialize the query results to JSON so they can be easily inspected. In the first unit test, we simply create a non-filtered query joining the two tables in a flat view:

C#
[TestMethod]
public void BasicQuery()
{
  using var context = GetContext();

  var itemSales = (from item in context.InventoryItems
    select new 
    {
      Item = item, 
      Sales = context.Sales.Where
              (s => s.InventoryItem_Id == item.Id).ToList() // FK join
    })
    .OrderBy(item => item.Item.Id)
    .ToList();

  Assert.AreEqual(2, itemSales.Count);
  Assert.AreEqual(3, itemSales[0].Sales.Count);
  Assert.AreEqual(2, itemSales[1].Sales.Count);

  var json = JsonConvert.SerializeObject(itemSales);
}

and we see in the JSON:

JavaScript
[
  {
    "Item": {
      "Id": 19,
      "Name": "Item 1"
    },
    "Sales": [
      {
        "Id": 41,
        "InventoryItem_Id": 19,
        "SaleDate": "1962-08-19T00:00:00"
      },
      {
        "Id": 42,
        "InventoryItem_Id": 19,
        "SaleDate": "1962-08-20T00:00:00"
      },
      {
        "Id": 43,
        "InventoryItem_Id": 19,
        "SaleDate": "1962-08-21T00:00:00"
      }
    ]
  },
  {
    "Item": {
      "Id": 20,
      "Name": "Item 2"
    },
    "Sales": [
      {
        "Id": 44,
        "InventoryItem_Id": 20,
        "SaleDate": "1991-03-21T00:00:00"
      },
      {
        "Id": 45,
        "InventoryItem_Id": 20,
        "SaleDate": "1991-03-22T00:00:00"
      }
    ]
  }
]

Importantly, we also see the SQL in the Output view:

SQL
Executed DbCommand (1ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
SELECT [i].[Id], [i].[Name], [s].[Id], [s].[InventoryItem_Id], [s].[SaleDate]
FROM [InventoryItems] AS [i]
LEFT JOIN [Sales] AS [s] ON [i].[Id] = [s].[InventoryItem_Id]
ORDER BY [i].[Id]

Adding the Dynamic Where

This is where it gets more interesting. We can create the Linq expression for sales filtered by a sale date >= 08/21/1962 and inventory items filtered by the first inventory item's ID:

C#
[TestMethod]
public void DynamicWhereQuery()
{
  using var context = GetContext();

  var itemFilters = new List<Filter>()
  {
    new Filter()
    {
      PropertyName= "Item.Id",
      Operation = Op.Equals,
      Value = inv1.Id
    }
  };

  var saleFilters = new List<Filter>()
  {
    new Filter()
    {
      PropertyName = "SaleDate",
      Operation = Op.GreaterThanOrEqual,
      Value = DateTime.Parse("08/21/1962")
    }
  };

  var sales = (from sale in context.Sales select sale).Where(saleFilters);

  var filteredItemSales = (from item in context.InventoryItems
  select new
  {
    Item = item,
    Sales = sales.Where(s => s.InventoryItem_Id == item.Id).ToList()  // FK join
  })
  .Where(itemFilters)
  .ToList();

  Assert.AreEqual(1, filteredItemSales.Count);
  Assert.AreEqual(1, filteredItemSales[0].Sales.Count);

  var json = JsonConvert.SerializeObject(filteredItemSales);
}

Here, we have created two filters, one for the InventoryItem record's ID and one for the Sale's SaleDate field. The sale filter is applied in the Linq expression and the inventory filter is applied when we construct the flattened view of the inventory item and its sales. The resulting JSON is:

JavaScript
[
  {
    "Item": {
      "Id": 79,
      "Name": "Item 1"
    },
    "Sales": [
      {
        "Id": 193,
        "InventoryItem_Id": 79,
        "SaleDate": "1962-08-21T00:00:00"
      }
    ]
  }
]

And we see that the SQL is performing the left join and the filtering both by the inventory item's ID and the sale's SaleDate:

SQL
Executed DbCommand (2ms) [Parameters=[@__p_0='1962-08-21T00:00:00.0000000', @__p_1='69'], 
CommandType='Text', CommandTimeout='30']
SELECT [i].[Id], [i].[Name], [t].[Id], [t].[InventoryItem_Id], [t].[SaleDate]
FROM [InventoryItems] AS [i]
LEFT JOIN (
  SELECT [s].[Id], [s].[InventoryItem_Id], [s].[SaleDate]
  FROM [Sales] AS [s]
  WHERE [s].[SaleDate] >= @__p_0
) AS [t] ON [i].[Id] = [t].[InventoryItem_Id]
WHERE [i].[Id] = @__p_1
ORDER BY [i].[Id]

Note that the ToList() is required, otherwise we get the following error:

Collections in the final projection must be an 'IEnumerable<T>' type such as 'List<T>'. Consider using 'ToList' or some other mechanism to convert the 'IQueryable<T>' or 'IOrderedEnumerable<T>' into an 'IEnumerable<T>'.

Because of this, we cannot apply both filters together because Sales has been resolved to a collection by the ToList() method -- it is no longer an expression. If we attempt to apply the filter on Sales, we get this error:

'SaleDate' is not a member of type 'System.Collections.Generic.List`1[DynamicWhere.Sale]' (Parameter 'propertyOrFieldName')

The key takeaway here is that the dynamic filter can only be applied to expressions, which makes sense because that, after all, is the whole point.

Behind the Scenes

Borrowing from Skenderi work, we have a set of operations, a Filter class, and three methods that EntityFramework knows how to convert to SQL:

C#
public enum Op
{
  Equals,
  GreaterThan,
  LessThan,
  GreaterThanOrEqual,
  LessThanOrEqual,
  Contains,
  StartsWith,
  EndsWith
}

public class Filter
{
  public string PropertyName { get; set; }
  public Op Operation { get; set; }
  public object Value { get; set; }
}

public static class ExpressionBuilder
{
  private static MethodInfo containsMethod = 
      typeof(string).GetMethod("Contains", new Type[] { typeof(string) });
  private static MethodInfo startsWithMethod = 
      typeof(string).GetMethod("StartsWith", new Type[] { typeof(string) });
  private static MethodInfo endsWithMethod = 
      typeof(string).GetMethod("EndsWith", new Type[] { typeof(string) });
  ...

The next pieces are from Heer's post, modified to accept a collection of filters:

C#
public static IQueryable<T> Where<T>(this IQueryable<T> query, List<Filter> filters)
{
  var target = Expression.Parameter(typeof(T));

  return query.Provider.CreateQuery<T>(CreateWhereClause<T>
                        (target, query.Expression, filters));
}

private static Expression CreateWhereClause<T>
 (ParameterExpression target, Expression expression, List<Filter> filters)
{
  var predicate = Expression.Lambda(CreateComparison<T>(target, filters), target);

  return Expression.Call(typeof(Queryable), nameof(Queryable.Where), 
         new[] { target.Type }, expression, Expression.Quote(predicate));
}

private static Expression CreateComparison<T>(ParameterExpression target, 
                          List<Filter> filters)
{
  Expression exp = null;

  filters.ForEach(filter =>
  {
    var memberAccess = CreateMemberAccess(target, filter.PropertyName);
    var exp2 = GetExpression<T>(memberAccess, filter);

    exp = exp == null ? exp2 : Expression.Or(exp, exp2);
  });

  return exp;
}

private static Expression CreateMemberAccess(Expression target, string selector)
{
  return selector.Split('.').Aggregate(target, Expression.PropertyOrField);
}

Very important! Note that I am building the expression by or'ing the filters together, not and'ing them together. The ability to combine "and" and "or" is beyond the scope of this article -- it is something I will address in a follow-up article.

The parts that I added handle converting the filter value to the expected model type, and in particular handling nullable data types in the model even though the filter is a not-nullable type.

C#
public static Expression GetSelector<T>(Filter filter)
{
  switch (filter.Value)
  {
    case int t1: return GetTypedSelector<T, int>(filter);
    case float f1: return GetTypedSelector<T, float>(filter);
    case double d1: return GetTypedSelector<T, double>(filter);
    case long l1: return GetTypedSelector<T, long>(filter);
    case DateTime dt1: return GetTypedSelector<T, DateTime>(filter);
    case bool b1: return GetTypedSelector<T, bool>(filter);
    case decimal d1: return GetTypedSelector<T, decimal>(filter);
    case char c1: return GetTypedSelector<T, char>(filter);
    case byte by1: return GetTypedSelector<T, byte>(filter);
    case short sh1: return GetTypedSelector<T, short>(filter);
    case ushort ush1: return GetTypedSelector<T, ushort>(filter);
    case uint ui1: return GetTypedSelector<T, uint>(filter);
    case ulong ul1: return GetTypedSelector<T, ulong>(filter);
    case string s1:
    {
      Expression<Func<string>> valueSelector = () => (string)filter.Value;
      return valueSelector.Body;
    }
    default: return null;
  }
}

public static Expression GetTypedSelector<T, R>(Filter filter) where R : struct
{
  // We actually need to get the property type, chaining from the container class,
  // and converting the value type to the property type using Expression.Convert
  var pi = GetPropertyInfo(typeof(T), filter.PropertyName);

  // This seems to be the preferred way.
  // Alternate: if (type.IsGenericType && type.GetGenericTypeDefinition() == 
  // typeof(Nullable<>))
  var propIsNullable = Nullable.GetUnderlyingType(pi.PropertyType) != null;

  Expression<Func<object>> valueSelector = () => filter.Value;
  Expression expr = propIsNullable ? Expression.Convert(valueSelector.Body, 
             typeof(R?)) : Expression.Convert(valueSelector.Body, typeof(R));

  return expr;
}

private static PropertyInfo GetPropertyInfo(Type baseType, string propertyName)
{
  string[] parts = propertyName.Split('.');

  return (parts.Length > 1)
    ? GetPropertyInfo(baseType.GetProperty(parts[0]).PropertyType, 
                      parts.Skip(1).Aggregate((a, i) => a + "." + i))
    : baseType.GetProperty(propertyName);
}

This code could probably take advantage of some sort of caching mechanism such that all this reflection doesn't have to be done for a filter on known model properties.

Lastly, borrowing again from Skenderi's code, the expression for the desired operation is created:

C#
private static Expression GetExpression<T>(Expression member, Filter filter)
{
  // How do we turn this into a SQL parameter 
  // so we're not susceptible to SQL injection attacks?
  // Like this: <a href="https://stackoverflow.com/a/71019524">
  // https://stackoverflow.com/a/71019524</a>
  //Expression<Func<object>> valueSelector = () => filter.Value;
  //var actualValue = valueSelector.Body;

  var actualValue = GetSelector<T>(filter);

  switch (filter.Operation)
  {
    case Op.Equals: return Expression.Equal(member, actualValue);
    case Op.GreaterThan: return Expression.GreaterThan(member, actualValue);
    case Op.GreaterThanOrEqual: 
         return Expression.GreaterThanOrEqual(member, actualValue);
    case Op.LessThan: return Expression.LessThan(member, actualValue);
    case Op.LessThanOrEqual: return Expression.LessThanOrEqual(member, actualValue);
    case Op.Contains: return Expression.Call(member, containsMethod, actualValue);
    case Op.StartsWith: return Expression.Call(member, startsWithMethod, actualValue);
    case Op.EndsWith: return Expression.Call(member, endsWithMethod, actualValue);
  }

  return null;
}

Logging the SQL and Parameter Values

Normally one would not do this, but as I wanted to see the SQL and the parameter values, the following is used to instantiate the database context:

C#
private MyContext GetContext()
{
  var builder = new DbContextOptionsBuilder<DbContext>().UseSqlServer(connectionString);
  builder.EnableSensitiveDataLogging();
  builder.LogTo(s => Debug.WriteLine(s), LogLevel.Information);
  var context = new MyContext(builder.Options);

  return context;
}

Conclusion

Hopefully, I have demonstrated a complete solution for dynamic IQueryable filtering that is not susceptible to SQL injection attacks, building on the work of three people smarter than me on how to do all this.

Appendix 1: Schema

SQL
USE [DynamicWhere]
GO
/****** Object: Table [dbo].[InventoryItems] Script Date: 4/2/2023 6:22:56 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[InventoryItems](
  [Id] [int] IDENTITY(1,1) NOT NULL,
  [Name] [nvarchar](255) NOT NULL,
CONSTRAINT [PK_InventoryItems] PRIMARY KEY CLUSTERED 
(
  [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, _
       IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, _
       ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
/****** Object: Table [dbo].[Sales] Script Date: 4/2/2023 6:22:56 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Sales](
  [Id] [int] IDENTITY(1,1) NOT NULL,
  [InventoryItem_Id] [int] NOT NULL,
  [SaleDate] [datetime] NOT NULL,
CONSTRAINT [PK_Sales] PRIMARY KEY CLUSTERED 
(
  [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, _
 IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, _
 OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Sales] WITH CHECK ADD CONSTRAINT [FK_Sales_InventoryItems] _
            FOREIGN KEY([InventoryItem_Id])
REFERENCES [dbo].[InventoryItems] ([Id])
GO
ALTER TABLE [dbo].[Sales] CHECK CONSTRAINT [FK_Sales_InventoryItems]
GO

Appendix 2: Caveat on Skenderi's Work

I came across this comment regarding case sensitivity in Skenderi, Build Where Caluses Dynamically in Linq, which I include here because it's useful to know this:

Contains, does not work if there is case sensitive data. So I made a slight change to contains, in case anyone needs it. Basically, use indexof:

C#
var pi = param.Type.GetProperty(filter.PropertyName);
var propertyAccess = Expression.MakeMemberAccess(param, pi);
var indexOf = Expression.Call(propertyAccess, 
              "IndexOf", null, Expression.Constant(constant.Value, typeof(string)), 
              Expression.Constant(StringComparison.InvariantCultureIgnoreCase));
return Expression.GreaterThanOrEqual(indexOf, Expression.Constant(0));

Because SQL does case-insensitive string comparisons by default, this is not an issue here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)