Introduction
When developing a complex line-of-business system, query reuse is often required. This article provides some guidelines for reusing LINQ expressions, and a utility that enables reuse of expressions in projections.
When looking for a way to reuse LINQ expressions for projections (LINQ Select() calls), I came across this reply by Marc Gravell. I liked the use of the term "black art" for expression reuse, so I reuse it here...
If you are only interested in using expressions in projections (LINQ Select() calls), skip ahead to the section "Introducing LinqExpressionProjection".
Prerequisites
This article assumes reasonable knowledge of LINQ.
Problem Domain
To demonstrate the goals of this article, let's assume a model containing projects and subprojects, represented by corresponding classes:
public class Project
{
    public int ID { get; set; }
    public virtual ICollection<Subproject> Subprojects { get; set; }
}

public class Subproject
{
    public int ID { get; set; }
    public int Area { get; set; }
    public Project Project { get; set; }
}
Reusing selector expressions
Now, let's also assume there is some piece of logic that determines which subproject is considered the "main" subproject of each project. Let's assume this logic is not trivial and is used across the application. Obviously, we would like to stay DRY and have this logic written in one place only. For performance reasons, we would also like this logic to be usable against the DB: we want to avoid bringing in all subprojects when we need only the main one.
With LINQ we can program that logic in a type-safe manner and in terms of our business objects.
Let's assume the main subproject is the subproject with the largest area, provided that this area is less than 1,000. First, we will select all main subprojects. At this point we include the selection logic in the query itself (kindly overlook the fact that edge cases are ignored):
var mainSubprojects = ctx.Subprojects
    .Where(sp =>
        sp.Area ==
        sp.Project.Subprojects.Where(spi => spi.Area < 1000)
            .Max(spii => spii.Area)).ToArray();
We would like to take the logic inside the Where clause (the lambda) and reuse it across the application. IntelliSense shows that the Where parameter is of type Expression<Func<Subproject, bool>>:
Func<Subproject, bool> means the parameter is expected to be a method that takes a Subproject and returns a boolean. Think of it as the body of a loop that runs for each subproject and indicates whether that subproject should be included in the result. The Expression part means this is not quite a function but rather an expression tree that may be compiled into a method. Alternatively, the tree may be translated into SQL (or any other data retrieval equivalent), depending on the data source (if this is unclear to you, have a look at this).
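The difference between a delegate and an expression tree can be seen in a small sketch. The class and member names below are made up for illustration, and a plain int stands in for the Subproject area. The delegate can only be invoked, while the expression can be inspected as data and then compiled when a method is needed.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionVsDelegateDemo
{
    // A delegate: already compiled code that can only be invoked.
    public static readonly Func<int, bool> IsSmallDelegate = area => area < 1000;

    // An expression tree: a data structure describing the same logic.
    public static readonly Expression<Func<int, bool>> IsSmallExpression =
        area => area < 1000;

    // The tree can be examined; the body here is a "less than" node.
    public static ExpressionType BodyKind() => IsSmallExpression.Body.NodeType;

    // The tree can also be turned into a delegate and executed in memory.
    public static bool RunCompiled(int area) => IsSmallExpression.Compile()(area);
}
```

A query provider such as an ORM takes the other route: instead of compiling the tree, it walks the nodes and emits the equivalent store query.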
Let's take that piece of lambda expression and put it into a variable:
private static Expression<Func<Subproject, bool>> mainSubprojectSelector = sp =>
    sp.Area ==
    sp.Project.Subprojects.Where(spi => spi.Area < 1000)
        .Max(spii => spii.Area);
And let's now rephrase our query:
var mainSubprojectsBySelector =
    ctx.Subprojects.Where(mainSubprojectSelector).ToArray();
Now, let's assume we want the main subproject only for project 1:
var proj1MainSubProj = ctx.Subprojects
    .Where(mainSubprojectSelector)
    .Single(sp => sp.Project.ID == 1);
OK, that’s nice, we have reused our logic.
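To try the reusable predicate without a database, here is a minimal, self-contained sketch that runs it against in-memory data through AsQueryable(). The classes are simplified copies of the model above, and the sample data and the SelectorDemo class are assumptions made up for this illustration. Every project in the sample has at least one subproject with an area under 1,000, because Max() throws on an empty sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Project
{
    public int ID { get; set; }
    public List<Subproject> Subprojects { get; } = new List<Subproject>();
}

public class Subproject
{
    public int ID { get; set; }
    public int Area { get; set; }
    public Project Project { get; set; }
}

public static class SelectorDemo
{
    // The reusable predicate, written once.
    public static readonly Expression<Func<Subproject, bool>> MainSubprojectSelector =
        sp => sp.Area ==
              sp.Project.Subprojects.Where(spi => spi.Area < 1000)
                  .Max(spii => spii.Area);

    // Hypothetical sample data: two projects with a few subprojects each.
    public static List<Subproject> BuildSample()
    {
        var p1 = new Project { ID = 1 };
        var p2 = new Project { ID = 2 };
        var all = new List<Subproject>
        {
            new Subproject { ID = 1, Area = 100, Project = p1 },
            new Subproject { ID = 2, Area = 500, Project = p1 },
            new Subproject { ID = 3, Area = 2000, Project = p1 },
            new Subproject { ID = 4, Area = 300, Project = p2 },
            new Subproject { ID = 5, Area = 50, Project = p2 },
        };
        foreach (var sp in all) sp.Project.Subprojects.Add(sp);
        return all;
    }

    // Reuse 1: the main subproject of every project.
    public static int[] MainSubprojectIds() =>
        BuildSample().AsQueryable()
            .Where(MainSubprojectSelector)
            .Select(sp => sp.ID)
            .ToArray();

    // Reuse 2: the main subproject of one project only.
    public static int MainSubprojectIdOf(int projectId) =>
        BuildSample().AsQueryable()
            .Where(MainSubprojectSelector)
            .Single(sp => sp.Project.ID == projectId)
            .ID;
}
```

Against a real ORM context the same expression object would be handed to the provider for translation rather than compiled in memory.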
Going from the projects
Note that this expression starts the selection from the subproject. When we are dealing with a project, it would make more sense to answer the question "What is this project's main subproject?". We would still want to use our original logic, so we might want a new expression that takes a project and returns a subproject, using the original logic. Maybe something like this:
private static Expression<Func<Project, Subproject>> projectMainSubprojSelector =
    proj => proj.Subprojects.AsQueryable().Single(mainSubprojectSelector);
And now we can do this:
var proj1MainSubprojByProj = ctx.Projects.Where(p => p.ID == 1).Select(projectMainSubprojSelector).Single();
(This would run fine in LINQ to Objects. However, if you try to run it with LINQ to Entities, you will get an error stating that the method Single() can only be used as the last method in the chain. The same is true for SingleOrDefault() and First(), but not for FirstOrDefault().)
Let's have another look at that query:
var proj1MainSubprojByProj = ctx.Projects.Where(p => p.ID == 1).Select(projectMainSubprojSelector).Single();
Note that you might think of using DbSet.Find() or DbSet.Single() to get the Project with ID == 1, but you would not be able to call Select() on the result, so it would be impossible to use the main subproject selection logic. We must keep passing an IQueryable<Project> down the chain to reuse the logic.
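The two routes can be contrasted in a small in-memory sketch. The Project class here is a simplified stand-in that stores subproject areas directly, and ChainDemo and its members are hypothetical names. Once a single object has been materialized, the only way left to apply the expression is to compile it and run it in memory, which with an ORM means the logic no longer runs in the store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Project
{
    public int ID { get; set; }
    public List<int> SubprojectAreas { get; } = new List<int>();
}

public static class ChainDemo
{
    // Hypothetical reusable selector: the largest subproject area below 1,000.
    public static readonly Expression<Func<Project, int>> MainAreaSelector =
        proj => proj.SubprojectAreas.Where(a => a < 1000).Max();

    // Stand-in for ctx.Projects, backed by an in-memory list.
    public static IQueryable<Project> Projects() =>
        new List<Project>
        {
            new Project { ID = 1, SubprojectAreas = { 100, 500, 2000 } },
            new Project { ID = 2, SubprojectAreas = { 300, 50 } },
        }.AsQueryable();

    // Route 1: stay on IQueryable<Project>, so the selector is part of the query.
    public static int ViaQuery() =>
        Projects().Where(p => p.ID == 1).Select(MainAreaSelector).Single();

    // Route 2: with a materialized Project there is no Select() to call;
    // the expression has to be compiled and executed in memory.
    public static int ViaLoadedObject()
    {
        Project project = Projects().Single(p => p.ID == 1);
        return MainAreaSelector.Compile()(project);
    }
}
```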
Projecting
Let's introduce a new requirement: we now have logic that retrieves the average effective area (AEA) of a project. That would be the average area of all its subprojects, excluding subprojects with an area of 1,000 or more. Here is an expression for that (to be reused in a DRY environment):
private static Expression<Func<Project, double>> projectAverageEffectiveAreaSelector =
    proj => proj.Subprojects.Where(sp => sp.Area < 1000).Average(sp => sp.Area);
And here is how to get the AEA of a project:
var proj1Aea =
    ctx.Projects.Where(p => p.ID == 1)
        .Select(projectAverageEffectiveAreaSelector)
        .Single();
Now, assume we want to retrieve the project ID and its AEA. As the AEA selector is an expression, we would like to do something like this:
var proj1AndMainSubprojByProj =
    ctx.Projects.Where(p => p.ID == 1)
        .Select(p => new
        {
            ProjID = p.ID,
            AEA = projectAverageEffectiveAreaSelector
        })
        .Single();
Well, no. This is a very interesting point about the way the compiler treats lambda expressions. The variable projectAverageEffectiveAreaSelector is of type Expression<Func<Project, double>>. When you write an assignment in an anonymous type initialization, the compiler creates a property of the type you are assigning to it. We want the property AEA to be of type double, but it is going to be of type Expression<Func<Project, double>>. The compiler has no idea that we want to bring in the expression and merge it into the LINQ query so that we end up with a double (using a pre-defined type instead of an anonymous type would not be any better: the assignment would simply not build).
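The type the compiler infers can be checked with a short sketch. The selector body and the StaticTypeOf helper are made up for this illustration; the helper simply reports the compile-time type of its argument.

```csharp
using System;
using System.Linq.Expressions;

public class Project
{
    public int ID { get; set; }
}

public static class AnonymousTypeDemo
{
    // Hypothetical selector standing in for projectAverageEffectiveAreaSelector.
    public static readonly Expression<Func<Project, double>> AeaSelector =
        proj => proj.ID * 1.5;

    // Returns the compile-time type inferred for the argument.
    public static Type StaticTypeOf<T>(T value) => typeof(T);

    public static Type TypeOfAeaProperty()
    {
        // The property type follows the assigned value: an expression, not a double.
        var row = new { ProjID = 1, AEA = AeaSelector };
        return StaticTypeOf(row.AEA);
    }
}
```

TypeOfAeaProperty() returns typeof(Expression<Func<Project, double>>), which is why the query above yields an expression object per row instead of a number.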
Introducing LinqExpressionProjection
The LinqExpressionProjection library has the sole purpose of allowing you to project from expressions that are not part of the LINQ query.
Using the code
All you need to do is:
- Add a reference to LinqExpressionProjection.dll (it is best to use the NuGet package).
- Call AsExpressionProjectable() on the queried collection.
- Call Project<T>() on the code element (variable, property, method, etc.) that retrieves the expression, passing the expected result type as the type parameter. Remember, the expression must be of type Expression<Func<TIn, TOut>>, where TIn is the type of the Select lambda parameter and TOut is both the type of the property you are setting and the type parameter of the Project<T>() call.
This is how it would be done for the above example:
var proj1AndAea =
    ctx.Projects.AsExpressionProjectable()
        .Where(p => p.ID == 1)
        .Select(p => new
        {
            Project = p,
            AEA = Utilities.projectAverageEffectiveAreaSelector.Project<double>()
        })
        .Single();
That's it.
You can download it here or, probably better, use the NuGet package.
How it works
This section can be safely skipped if you "just want it to work". However, note that there is some more discussion of expression reuse at the end of the article.
Part a – Replacing Project<T>() calls
The call to AsExpressionProjectable() wraps the IQueryable with another IQueryable that is in charge of visiting the expression tree and replacing the calls to Project<T>().
When a call to Project<T>() is visited, the method call argument is compiled and executed. As this is an extension method, the argument is the part where you retrieve the reusable expression of type Expression<Func<TIn, TOut>>. The argument is compiled and executed, not analyzed or interpreted. This means any piece of code that has the right return type can fit there, including parameterized method calls.
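The idea of compiling and executing a sub-expression, rather than analyzing it, can be sketched as follows. This is not the library's code; MultiplyBy, Evaluate and ResolveSelector are hypothetical names, and the sketch only shows the general technique of wrapping an expression node in a parameterless lambda and invoking it.

```csharp
using System;
using System.Linq.Expressions;

public static class ArgumentEvaluationDemo
{
    // Hypothetical parameterized factory for a reusable expression.
    public static Expression<Func<int, int>> MultiplyBy(int factor) =>
        n => n * factor;

    // Wraps any parameterless sub-expression in a lambda, compiles it,
    // runs it, and returns whatever value it produces.
    public static object Evaluate(Expression expression) =>
        Expression.Lambda(expression).Compile().DynamicInvoke();

    public static Expression<Func<int, int>> ResolveSelector()
    {
        // The body of this tree is a method call node for MultiplyBy(3);
        // the call itself has not been executed yet.
        Expression<Func<Expression<Func<int, int>>>> holder = () => MultiplyBy(3);

        // Executing the node yields the actual reusable expression object.
        return (Expression<Func<int, int>>)Evaluate(holder.Body);
    }
}
```

Because the node is executed rather than inspected, it does not matter whether the expression comes from a field, a property, or a method call with arguments, as long as the result has the expected type.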
The body of the resulting lambda expression is then visited and inserted into the expression tree, replacing both the call to Project<T>() and its argument.
Expression tree visiting in LinqExpressionProjection follows the popular expression tree rewriting pattern and is based on LinqKit by the great Joseph Albahari (if you do not own LINQPad, you really should), which in turn is based on Tomas Petricek's work. All of the above, and this article, make use of the now industry-standard ExpressionVisitor by Matt Warren (Matt has a series of blog posts called "LINQ: Building an IQueryable Provider" in which he builds the "IQToolkit". Read it if you ever want to get a glimpse of genuine genius).
Part b – Rebinding parameters
There are two parameters in play here. One is the parameter of the projection lambda (the one passed to the Select() method call), and the other is the parameter of the lambda from the reusable expression. These parameters are expected to be of the same type (and the code validates that), but they are distinct objects, so the projection parameter must replace the reusable expression's parameter for the operation to succeed.
Parameter rebinding is achieved by visiting the relevant part of the tree and replacing the occurrences. The visitor is credited to Colin Meek, and it is also based on Matt Warren's work mentioned above.
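The rebinding step can be sketched with the ExpressionVisitor class that ships in System.Linq.Expressions. This is a simplified illustration under stated assumptions, not the visitor used by the library: ParameterReplacer, RebindDemo, and the simplified Project class are hypothetical, and a Tuple stands in for the anonymous type because it is easy to construct by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Project
{
    public int ID { get; set; }
    public List<int> Areas { get; } = new List<int>();
}

// Replaces every occurrence of one parameter with another expression.
public sealed class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression _from;
    private readonly Expression _to;

    public ParameterReplacer(ParameterExpression from, Expression to)
    {
        _from = from;
        _to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node) =>
        node == _from ? _to : base.VisitParameter(node);
}

public static class RebindDemo
{
    // Reusable expression; its parameter "proj" is a different object
    // from the parameter of the projection lambda built below.
    public static readonly Expression<Func<Project, double>> AverageArea =
        proj => proj.Areas.Where(a => a < 1000).Average();

    // Builds p => new Tuple<int, double>(p.ID, <AverageArea body rebound to p>).
    public static Expression<Func<Project, Tuple<int, double>>> BuildProjection()
    {
        ParameterExpression p = Expression.Parameter(typeof(Project), "p");
        Expression aea =
            new ParameterReplacer(AverageArea.Parameters[0], p).Visit(AverageArea.Body);
        var ctor = typeof(Tuple<int, double>)
            .GetConstructor(new[] { typeof(int), typeof(double) });
        Expression body =
            Expression.New(ctor, Expression.Property(p, nameof(Project.ID)), aea);
        return Expression.Lambda<Func<Project, Tuple<int, double>>>(body, p);
    }

    public static Tuple<int, double> Run()
    {
        var projects = new List<Project>
        {
            new Project { ID = 1, Areas = { 100, 500, 2000 } },
        };
        return projects.AsQueryable().Select(BuildProjection()).Single();
    }
}
```

Without the replacement, the inlined body would still refer to the original "proj" parameter, which is not defined in the new lambda, and compiling or translating the tree would fail.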
The project is open source and can be found on CodePlex.
Points of interest
Query style – A note
There are two flavors of LINQ: query syntax and method chaining. If you are versed in the query syntax, consider the fact that method chaining lends itself to expression reuse while query syntax doesn't. Actually, using tools like the ones described here you might get it to work in query syntax, but it would make your queries much less readable, losing most of the benefit of query syntax. You are also bound to lose the help of IntelliSense in understanding the expression tree and the selector expressions you are expected to provide. If you plan for massive LINQ reuse, aim for the method syntax or for a mixed one.
LINQ Expression reuse considerations
Some things can be achieved either by retrieving data and then processing it, or by creating a more complex data retrieval that also includes some of the processing. In the world of LINQ, where the query itself is type-safe, testable, and described in problem domain terms, it is natural to want to do more in the query. While this is generally good, and expression reuse is a supporting tool for that practice, please consider that LINQ is harder to understand than procedural code. It is also generally harder to debug, and much harder when you are not querying objects but rather using an ORM to compose store queries.
Think ahead when planning your approach.
History
- June 2012: First released.