Introduction
In the previous article, we introduced IQueryable. That article was narrowly focused on introducing the reader to a few basic concepts:
- The extension methods in System.Linq.Queryable operate on instances of IQueryable by building an expression tree (Expression). IQueryable simply pairs the expression tree (Expression) with a query provider (IQueryProvider).
- The query provider (IQueryProvider) is responsible for interpreting the expression tree, executing the query, and fetching the results.
- The query provider (IQueryProvider) may limit what can appear in an expression tree.
To explain this, we provided some examples of expression trees. That article was intentionally vague about how expression trees get built in the first place.
This article will dig slightly deeper into that topic. It will demonstrate a clear contrast between the extension methods in System.Linq.Enumerable and System.Linq.Queryable. It will demonstrate how some simple expression trees are actually produced and consumed.
Background
Because this article is quite short, I originally considered including it as a part of LINQ Part 3: An Introduction to IQueryable. However, I was concerned that this additional detail might distract from the basic concepts in that article. This article assumes you have already read the previous article.
This is the fourth in a series of articles on LINQ.
Getting Ready
In the source included with this article, we examine two different implementations of the Where method. To do this, we first create both an IEnumerable and an IQueryable instance.
var things = new[] { "Red Apple", "Green Apple", "Red Balloon", "Green Balloon" };
var enumerable = things.AsEnumerable();
var queryable = things.AsQueryable();
So that we can take a closer look at the functionality, the source includes approximate equivalents of the standard System.Linq.Enumerable.Where and System.Linq.Queryable.Where methods: MyEnumerable.MyWhere and MyQueryable.MyWhere.
Syntactically, the calls to these two methods are identical:
enumerable = enumerable.MyWhere(item => item.StartsWith("Red"));
queryable = queryable.MyWhere(item => item.StartsWith("Red"));
However, they result in two entirely different method calls. Since enumerable is an instance of IEnumerable, it will call the MyEnumerable.MyWhere method.
In contrast, queryable is an instance of IQueryable, and will call the MyQueryable.MyWhere method.
Help From the Compiler
The first surprising thing we'll note is that these two methods have entirely different parameter types for the predicate.
IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate);
IQueryable<T> MyWhere<T>(this IQueryable<T> source, Expression<Func<T, bool>> predicate);
How is it possible that we can (apparently) pass the exact same value for two different parameter types? The short answer is that we can't.
This subtle difference in method signature actually triggers two very different behaviors in the compiler.
In the first case (Func<T, bool>), the compiler simply constructs a delegate for the predicate method. This is what is passed to the MyEnumerable.MyWhere method.
In the second case (Expression<Func<T, bool>>), the compiler constructs an entire expression tree, on your behalf, and passes this expression tree as a parameter to the MyQueryable.MyWhere method. The tree it constructs (and passes) is a LambdaExpression with one parameter (item). Its body is a MethodCallExpression for String.StartsWith, invoked on the parameter item with the constant "Red" as its argument.
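To make the two compiler behaviors concrete, here is a minimal sketch that converts the same lambda text once to a delegate and once to an expression tree. The class and method names (LambdaConversionDemo, AsDelegate, AsExpression, BuildByHand) are hypothetical and not part of the article's source; BuildByHand only approximates what the compiler emits.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical demo class; not part of the article's source.
public static class LambdaConversionDemo
{
    // Target type Func<string, bool>: the compiler emits a delegate (compiled code).
    public static Func<string, bool> AsDelegate()
    {
        return item => item.StartsWith("Red");
    }

    // Target type Expression<Func<string, bool>>: the compiler emits an expression tree (data).
    public static Expression<Func<string, bool>> AsExpression()
    {
        return item => item.StartsWith("Red");
    }

    // Approximately what the compiler generates for AsExpression, written by hand.
    public static Expression<Func<string, bool>> BuildByHand()
    {
        ParameterExpression item = Expression.Parameter(typeof(string), "item");
        MethodCallExpression body = Expression.Call(
            item,
            typeof(string).GetMethod("StartsWith", new[] { typeof(string) }),
            Expression.Constant("Red"));
        return Expression.Lambda<Func<string, bool>>(body, item);
    }

    public static void Main()
    {
        Console.WriteLine(AsDelegate()("Red Apple"));              // True
        Console.WriteLine(AsExpression().Body.NodeType);           // Call
        Console.WriteLine(BuildByHand().Compile()("Green Apple")); // False
    }
}
```

The delegate can only be invoked. The expression tree can be inspected node by node, or compiled into a delegate when execution is actually wanted.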
Same Name - Very Different Methods
We have two very different implementations of the MyWhere method: MyEnumerable.MyWhere and MyQueryable.MyWhere. Let's take a quick look at how they work.
MyEnumerable.MyWhere
In MyEnumerable.MyWhere, we operate directly upon the original IEnumerable. The logic here is very simple.
We create a new instance of IEnumerable that only includes the items from the original IEnumerable that match the predicate condition. The standard Enumerable.Where method provides similar functionality, but also includes some performance optimizations.
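public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    // Note: because this is an iterator method (it uses yield return),
    // these argument checks run when enumeration starts, not when MyWhere is called.
    if (source == null)
        throw new ArgumentNullException(nameof(source));
    if (predicate == null)
        throw new ArgumentNullException(nameof(predicate));

    // Lazily yield only the items that satisfy the predicate.
    foreach (T item in source)
        if (predicate(item))
            yield return item;
}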
MyQueryable.MyWhere
In MyQueryable.MyWhere, we do something very different. We create a new instance of IQueryable that has a slightly altered version of the original expression tree (passed in as a parameter).
Principally, we wrap the original expression in a MethodCallExpression for the standard Queryable.Where method. The standard Queryable.Where method provides similar functionality, by wrapping the original expression in a self-referencing MethodCallExpression.
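public static IQueryable<T> MyWhere<T>(this IQueryable<T> source, Expression<Func<T, bool>> predicate)
{
    if (source == null)
        throw new ArgumentNullException(nameof(source));
    if (predicate == null)
        throw new ArgumentNullException(nameof(predicate));

    // GetMethodInfo is a helper defined elsewhere in the accompanying source;
    // here it resolves the MethodInfo for Queryable.Where<T>.
    MethodInfo whereMethodInfo = GetMethodInfo<T>((s, p) => Queryable.Where(s, p));

    // Arguments: the existing tree, plus the predicate quoted as an expression-tree argument.
    var callArguments = new[] { source.Expression, Expression.Quote(predicate) };

    // A static method call node (null instance) that references Queryable.Where.
    var callToWhere = Expression.Call(null, whereMethodInfo, callArguments);

    // Ask the provider for a new IQueryable that wraps the new tree.
    return source.Provider.CreateQuery<T>(callToWhere);
}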
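The new expression tree has a MethodCallExpression for Queryable.Where at its root, with two arguments: the original source expression (source.Expression) and the quoted predicate lambda.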
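The shape of the wrapped tree can be checked directly. The following sketch uses the standard Queryable.Where (rather than MyQueryable.MyWhere, whose helper is not shown here) and inspects the root node; the class name WhereCallInspection is hypothetical.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical demo class; not part of the article's source.
public static class WhereCallInspection
{
    public static MethodCallExpression BuildWhereCall()
    {
        var things = new[] { "Red Apple", "Green Apple", "Red Balloon", "Green Balloon" };
        IQueryable<string> queryable = things.AsQueryable()
            .Where(item => item.StartsWith("Red"));

        // Nothing has been filtered yet; the queryable only holds a description of the query.
        return (MethodCallExpression)queryable.Expression;
    }

    public static void Main()
    {
        MethodCallExpression call = BuildWhereCall();
        Console.WriteLine(call.Method.DeclaringType.Name + "." + call.Method.Name); // Queryable.Where
        Console.WriteLine(call.Arguments[0].NodeType); // Constant (the original source)
        Console.WriteLine(call.Arguments[1].NodeType); // Quote (the wrapped predicate lambda)
    }
}
```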
Not a True Method Call
An astute observer may wonder why the Queryable.Where method would wrap the expression in a self-referencing MethodCallExpression. This would seem to be a recipe for infinite recursion and stack overflows.
In this behavior, Queryable.Where is far from alone. Most of the extension methods in the Queryable class have a similar implementation.
So, what's going on here? Why don't we see infinite recursion / stack overflow?
The fact is that no IQueryProvider worth its salt is ever going to truly call these methods. Think of these MethodCallExpression nodes as an advertisement of intent.
With the MethodCallExpression node referencing the Queryable.Where method, we're simply informing the query provider that it should limit the results to those matching the predicate condition. We do not expect it will actually call this method.
At the time of iteration, the query provider (IQueryProvider) is responsible for interpreting the expression tree, executing the query, and fetching the results.
As demonstrated in this code, it is possible to write your own IQueryable extension methods. However, to ensure that any IQueryProvider can interpret your expression tree, it is important that you limit its contents to those that might appear in an expression tree created by one of the standard methods.
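One way to see these nodes as "advertised intent" is to read them without executing anything. The sketch below walks a query's expression tree and records every method call declared on System.Linq.Queryable. QueryableCallCollector and IntentDemo are hypothetical names, and a real provider would act on each node rather than just list it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical visitor: records the Queryable methods referenced by a tree.
public sealed class QueryableCallCollector : ExpressionVisitor
{
    public List<string> Names { get; } = new List<string>();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // The node only references the method; nothing here invokes it.
        if (node.Method.DeclaringType == typeof(Queryable))
            Names.Add(node.Method.Name);
        return base.VisitMethodCall(node);
    }
}

public static class IntentDemo
{
    public static List<string> Describe(IQueryable query)
    {
        var collector = new QueryableCallCollector();
        collector.Visit(query.Expression);
        return collector.Names;
    }

    public static void Main()
    {
        var query = new[] { 3, 1, 2 }.AsQueryable()
            .Where(n => n > 1)
            .OrderBy(n => n)
            .Take(1);

        // The outermost call is the root of the tree, so it is visited first.
        Console.WriteLine(string.Join(", ", Describe(query))); // Take, OrderBy, Where
    }
}
```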
Let's consider two common query providers: LINQ to Objects and LINQ to SQL.
LINQ to Objects (System.Linq.EnumerableQuery)
In our example, we create our IQueryable via a call to the Queryable.AsQueryable method. This creates an instance of System.Linq.EnumerableQuery. This class is really only a thin wrapper around the LINQ to Objects extension methods (in System.Linq.Enumerable).
Note: The System.Linq.EnumerableQuery class implements both IQueryable and IQueryProvider. So, it's doing double duty here: as both a queryable and a query provider.
When we begin to iterate this queryable, we make a call to its IEnumerable<T>.GetEnumerator method. This method rewrites the expression tree, before compiling and executing it.
Along with other changes, it finds all MethodCallExpression nodes in the tree that reference a method declared in the System.Linq.Queryable class. It then replaces these nodes with MethodCallExpression nodes that reference equivalent methods in the System.Linq.Enumerable class.
So, in our example, Queryable.Where becomes Enumerable.Where.
We're not really calling the Queryable.Where method referenced in the original expression tree. Instead, we're calling the equivalent Enumerable.Where method referenced in the rewritten tree.
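The rewrite step can be approximated in a few lines. This is only a sketch under simplifying assumptions: it handles a single method (the Where overload that takes a one-argument predicate), whereas the real provider covers the whole Queryable class and makes other adjustments. QueryableToEnumerableRewriter and RewriteDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical visitor: swaps Queryable.Where nodes for Enumerable.Where nodes.
public sealed class QueryableToEnumerableRewriter : ExpressionVisitor
{
    private static readonly MethodInfo QueryableWhere =
        new Func<IQueryable<object>, Expression<Func<object, bool>>, IQueryable<object>>(Queryable.Where)
            .Method.GetGenericMethodDefinition();

    private static readonly MethodInfo EnumerableWhere =
        new Func<IEnumerable<object>, Func<object, bool>, IEnumerable<object>>(Enumerable.Where)
            .Method.GetGenericMethodDefinition();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.IsGenericMethod &&
            node.Method.GetGenericMethodDefinition() == QueryableWhere)
        {
            // Rewrite the source first, in case it contains nested Where calls.
            Expression source = Visit(node.Arguments[0]);

            // Strip the Quote so the lambda is passed as a delegate, not as a tree.
            var lambda = (LambdaExpression)((UnaryExpression)node.Arguments[1]).Operand;

            MethodInfo target = EnumerableWhere.MakeGenericMethod(node.Method.GetGenericArguments());
            return Expression.Call(target, source, lambda);
        }
        return base.VisitMethodCall(node);
    }
}

public static class RewriteDemo
{
    public static List<string> Run()
    {
        var things = new[] { "Red Apple", "Green Apple", "Red Balloon", "Green Balloon" };
        IQueryable<string> queryable = things.AsQueryable().Where(item => item.StartsWith("Red"));

        Expression rewritten = new QueryableToEnumerableRewriter().Visit(queryable.Expression);

        // Compile the rewritten tree and run it: this executes Enumerable.Where.
        Func<IEnumerable<string>> compiled =
            Expression.Lambda<Func<IEnumerable<string>>>(rewritten).Compile();
        return compiled().ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // Red Apple, Red Balloon
    }
}
```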
LINQ to SQL (System.Data.Linq.Table)
With LINQ to SQL, something very different occurs. We create an instance of the System.Data.Linq.Table class. This class is remarkably complex.
Note: System.Data.Linq.Table serves as both the IQueryable and IQueryProvider for LINQ to SQL.
When we begin to iterate this queryable, a lot happens. In order to explain it, we'll be leaving out much of the fine detail and instead discussing it at a high level.
Basically, this provider visits each of the nodes in the expression tree, so that it can create the text for a complete SQL command (e.g. SELECT * FROM Items WHERE Color = 'Red').
In this case, the expression tree might include a MethodCallExpression for the Queryable.Where method. This method call would be translated into the text for a SQL WHERE clause (e.g. WHERE Color = 'Red').
Once again, we're not actually calling the Queryable.Where method referenced in the original expression tree. Instead, that MethodCallExpression simply provides information that is formatted into text.
When the complete SQL command is formatted, the query provider simply uses ADO.NET's DbConnection / DbCommand to execute it. It then returns an IEnumerator that iterates over the result set.
As it iterates, it creates appropriate instances and sets instance properties to their corresponding column values.
Of course, this is an oversimplification. However, it does provide a taste of the high-level functionality for this complex provider.
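To illustrate the idea of formatting a tree into text, here is a toy translator. It is emphatically not how LINQ to SQL is implemented: it supports only a Where call over a root queryable, an equality test, a property access, and a string constant. The Item class, the ToySqlTranslator name, and the rule of forming the table name by appending "s" to the element type are all assumptions made for this sketch.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity type for the sketch.
public class Item
{
    public string Color { get; set; }
}

// Hypothetical translator; handles only the node kinds listed below.
public static class ToySqlTranslator
{
    public static string Translate(Expression expression)
    {
        switch (expression)
        {
            // Queryable.Where(source, predicate) becomes "<source> WHERE <predicate>".
            case MethodCallExpression call
                when call.Method.DeclaringType == typeof(Queryable) && call.Method.Name == "Where":
                var quote = (UnaryExpression)call.Arguments[1];
                var lambda = (LambdaExpression)quote.Operand;
                return Translate(call.Arguments[0]) + " WHERE " + Translate(lambda.Body);

            // The root queryable becomes the SELECT clause (table name is an assumption).
            case ConstantExpression root when root.Value is IQueryable source:
                return "SELECT * FROM " + source.ElementType.Name + "s";

            // left == right becomes "left = right".
            case BinaryExpression binary when binary.NodeType == ExpressionType.Equal:
                return Translate(binary.Left) + " = " + Translate(binary.Right);

            // item.Color becomes the column name.
            case MemberExpression member:
                return member.Member.Name;

            // String constants become quoted SQL literals.
            case ConstantExpression constant when constant.Value is string text:
                return "'" + text.Replace("'", "''") + "'";

            default:
                throw new NotSupportedException("Unsupported node: " + expression.NodeType);
        }
    }

    public static void Main()
    {
        IQueryable<Item> query = new Item[0].AsQueryable()
            .Where(item => item.Color == "Red");

        Console.WriteLine(Translate(query.Expression));
        // SELECT * FROM Items WHERE Color = 'Red'
    }
}
```

A production provider would send values as command parameters rather than inlining them as literals. The default branch also shows why a provider may limit what can appear in an expression tree: anything it does not recognize cannot be translated.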
Same Name - Same Results
In our example source code, when we iterate either enumerable or queryable, we get the same results:
WriteItems(enumerable);
WriteItems(queryable);
To prove that we are truly dealing with an expression tree, in the case of queryable, we can simply examine its Expression property. Note: The source includes a simple derivation of System.Linq.Expressions.ExpressionVisitor (to dump the expression tree to the console).
WriteExpression(queryable.Expression);
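The article's WriteExpression helper is not reproduced here. As a rough stand-in, a visitor along the following lines prints one indented line per node. ExpressionDumper and Dump are hypothetical names, and the output format is an assumption rather than the format used by the included source.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Text;

// Hypothetical stand-in for the article's expression dumper.
public sealed class ExpressionDumper : ExpressionVisitor
{
    private readonly StringBuilder _output = new StringBuilder();
    private int _depth;

    public static string Dump(Expression expression)
    {
        var dumper = new ExpressionDumper();
        dumper.Visit(expression);
        return dumper._output.ToString();
    }

    public override Expression Visit(Expression node)
    {
        // Static method calls have a null Object, so null nodes must be tolerated.
        if (node == null)
            return null;

        // One line per node: indentation, node kind, and static type.
        _output.Append(' ', _depth * 2)
               .Append(node.NodeType)
               .Append(": ")
               .AppendLine(node.Type.Name);

        _depth++;
        Expression result = base.Visit(node); // dispatches to the node-specific Visit method
        _depth--;
        return result;
    }

    public static void Main()
    {
        var things = new[] { "Red Apple", "Green Apple", "Red Balloon", "Green Balloon" };
        IQueryable<string> queryable = things.AsQueryable()
            .Where(item => item.StartsWith("Red"));

        Console.Write(Dump(queryable.Expression));
    }
}
```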
Summary
From this article, we should now understand the following about most methods in the System.Linq.Queryable class:
- The C# compiler actually does a lot of the work. It creates expression trees from lambda expressions. The expression trees are then passed, as parameters, to the relevant method.
- Most of the methods simply wrap the original expression tree in a self-referencing MethodCallExpression. They then create a new IQueryable instance that references the resulting expression tree.
- The self-referencing MethodCallExpression, in the created expression tree, is never actually called. Instead, it merely advertises intent to the query provider (IQueryProvider).
- The query provider acts upon this advertised intent. In some cases, it may translate the intent into an equivalent method call (e.g. Enumerable.Where). In others, it may translate the intent into the text for some query language (e.g. SQL).
History
- 4/25/2018 - The original version was uploaded