Introduction
The v3.5 release of the .NET framework includes a significant number of new and enhanced technologies. LINQ (Language Integrated Query) is, in my opinion, the most significant new technology in the v3.5 release. Microsoft has implemented a set of libraries to transform LINQ expression trees to SQL statements and dubbed these DLINQ. DLINQ is a very impressive piece of work, but unfortunately, it is available only for SQL Server 2000 and 2005.
Background
The objective of this article (and further ones to follow) is to demonstrate how to transform LINQ expression trees to SQL statements that can be executed against multiple RDBMS systems and not just Microsoft's SQL Server offerings. I am aware of at least one other set of excellent articles, notably this one at the WaywardWeblog, that demonstrates how to perform this transformation. I have made use of two components introduced in the WaywardWeblog articles, namely the partial evaluator and the expression tree walker. However, the last time I checked, the articles did not demonstrate how to:
- Correctly and comprehensively translate binary and unary expressions that have valid translations into SQL.
- Translate function calls (e.g.,
customer.FirstName.ToUpper()
) that have SQL equivalents. - Implement
GroupBy
. - Implement the
IQueryable
methods ANY
, ALL
, COUNT
, AVERAGE
etc. - Parameterize queries instead of embedding constants in the SQL transformation.
- Perform caching of previously translated expression trees.
- Potentially not make use of MARS.
In addition, I wanted to perform the translations in the simplest and most directly possible fashion (this is, of course, a matter of subjective taste to some extent, but I hope that you will agree with me after you've read the articles). You will, therefore, find significant differences in the approach taken here and those you might find elsewhere.
The Binder
Creating a LINQ to SQL translator is a non-trivial task, too lengthy to cover in one article. In this article, I will, therefore, discuss just one class used in my implementation - the Binder
. This class will illustrate a number of interesting concepts, but will remain accessible enough so as not to "drown" the reader.
The Binder
is a class that takes a DbDataReader
and assigns values in that reader to a newly instantiated object of a given class.
I can hear some of you chocking, so here's an example to help you wash the concept down.
Suppose we have a LINQ query that looks like so:
var customers = from customer in customers
where customer.City == city
select new { Name = customer.ContactName,
Phone = customer.Phone };
This will translate into the following SQL statement:
SELECT t0.ContactName, t0.Phone
FROM dbo.Customers AS t0
WHERE (t0.City = @p0)
We will then need to create a command, populate the parameter collection accordingly (i.e., supply a value for the city
parameter in this case), and then execute the command and retrieve a DbDataReader
with two fields: ContactName
and Phone
.
The Binder
will then be responsible for creating an anonymous type with two properties, Name
and Phone
, to which we shall assign the values ContactName
and Phone
, respectively, retrieved from the DbDataReader
.
The Gory Details
The LINQ query above will produce the following expression:
.Where(customer => (customer.City = value(LinqTest.NorthwindLinq+<>c__DisplayClass1).city))
.Select(customer => new <>f__AnonymousType0`2(Name = customer.ContactName,
Phone = customer.Phone))
For the purposes of the Binder
, we are interested only in the following Lambda expression:
customer => new <>f__AnonymousType0`2(Name = customer.ContactName,
Phone = customer.Phone)
Which means essentially:
- Given a parameter called
customer
of type Customer
:
customer => new <>f__AnonymousType0`2(Name = customer.ContactName,
Phone = customer.Phone)
- Create an instance of type
<>f__AnonymousType0`2
:
customer => new <>f__AnonymousType0`2(Name = customer.ContactName,
Phone = customer.Phone)
- And, while you’re at it, assign the value of
customer.ContactName
to Name
and customer.Phone
to Phone
:
customer => new <>f__AnonymousType0`2(Name = customer.ContactName,
Phone = customer.Phone)
- You're done.
Sounds simple enough, but as usual, the devil is in the details.
You will recall from above that for the purposes of the Binder
, we get a DbDataReader
with two fields: ContactName
and Phone
, i.e., we do not have a parameter called customer
of type type Customer
with two properties ContactName
and Phone
. So, what to do?
I've been told the shortest path between two points is a straight line, so why not change the Lambda Expression above so that it picks its values from a parameter called reader
of type DbDataReader
?
In other words, we want to turn this:
customer => new <>f__AnonymousType0`2(Name = customer.ContactName,
Phone = customer.Phone)
into this:
reader => new <>f__AnonymousType0`2(Name = reader.GetString(0),
Phone = reader.GetString(1))
If you’re asking yourself whether we're done yet, the answer is No.
We have three problems (at least):
- How do we know that
reader.GetString(0)
gets us the ContactName
? - How do we know that we should call
reader.GetString(0)
and not reader.GetInt16(0)
or any of the other reader.Getxxx
methods? - What happens if we call
reader.Getxxx
and the value is null
? (Answer: you'll get an error if you do.)
It turns out that the third problem is the easiest to solve. We want a Lambda expression that looks like so:
reader => new <>f__AnonymousType0`2(Name = IIF(Not(reader.IsDBNull(0)),
reader.GetString(0), Convert(null)),
Phone = IIF(Not(reader.IsDBNull(1)),
reader.GetString(1), Convert(null)))
That’s quite a mouthful, but all we're saying in the end is:
IF NOT reader.IsDBNull(0)) Then
Name = reader.GetString(0)
ELSE
Name = NULL
END IF
and likewise for phone
.
Unfortunately, in order to resolve the first two problems, we must dive deeper into the code, and that will be the subject of the next article. For those of you who can't wait, here is the complete listing for the Binder
class:
private class Binder : ExpressionVisitor {
private readonly LambdaExpression selector = null;
private readonly LambdaExpression binderLambda = null;
private readonly Delegate binderMethod = null;
private readonly Dictionary<string,> columnPositions = new Dictionary<string,>();
private readonly ParameterExpression reader =
Expression.Parameter(typeof(DbDataReader), "reader");
private static readonly MethodInfo getBoolean =
typeof(DbDataReader).GetMethod("GetBoolean");
private static readonly MethodInfo getByte =
typeof(DbDataReader).GetMethod("GetByte");
private static readonly MethodInfo getChar =
typeof(DbDataReader).GetMethod("GetChar");
private static readonly MethodInfo getDateTime =
typeof(DbDataReader).GetMethod("GetDateTime");
private static readonly MethodInfo getDecimal =
typeof(DbDataReader).GetMethod("GetDecimal");
private static readonly MethodInfo getDouble =
typeof(DbDataReader).GetMethod("GetDouble");
private static readonly MethodInfo getGUID =
typeof(DbDataReader).GetMethod("GetGuid");
private static readonly MethodInfo getInt16 =
typeof(DbDataReader).GetMethod("GetInt16");
private static readonly MethodInfo getInt32 =
typeof(DbDataReader).GetMethod("GetInt32");
private static readonly MethodInfo getInt64 =
typeof(DbDataReader).GetMethod("GetInt64");
private static readonly MethodInfo getString =
typeof(DbDataReader).GetMethod("GetString");
private static readonly MethodInfo getValue =
typeof(DbDataReader).GetMethod("GetValue");
public Delegate BinderMethod {
get {
return binderMethod;
}
}
public Binder(LambdaExpression selector) {
this.selector = selector;
if (selector.Body.NodeType != ExpressionType.Parameter) {
binderLambda = Expression.Lambda(((LambdaExpression)this.Visit(selector)).Body,
reader);
}
else {
binderLambda = GetBindingLambda(selector);
}
binderMethod = binderLambda.Compile();
}
protected override Expression VisitMethodCall(MethodCallExpression m) {
switch (m.Method.Name) {
case "Count":
case "Average":
case "Max":
case "Min":
case "Sum":
break;
default:
return base.VisitMethodCall(m);
}
Debug.Assert(m.Arguments.Count > 0);
Debug.Assert(m.Arguments[0].NodeType == ExpressionType.MemberAccess);
if (GetAccessedType(m.Arguments[0] as MemberExpression) !=
selector.Parameters[0].Type) {
return m;
}
int columnPosition = GetColumnPosition(m.ToString());
return GetColumnReader(m, columnPosition);
}
protected override Expression VisitMemberAccess(MemberExpression m) {
Debug.Assert(selector.Parameters.Count == 1);
if (GetAccessedType(m) != selector.Parameters[0].Type) {
return m;
}
int columnPosition = GetColumnPosition(m);
return GetColumnReader(m, columnPosition);
}
private Expression GetColumnReader(Expression m, int columnPosition) {
var column = Expression.Constant(columnPosition, typeof(int));
var callExpression = GetCallMethod(m, column);
var isDbNull = Expression.Call(reader,
typeof(DbDataReader).GetMethod("IsDBNull"),
column);
var conditionalExpression =
Expression.Condition(Expression.Not(isDbNull),
callExpression,
Expression.Convert(Expression.Constant(null),
callExpression.Type));
return conditionalExpression;
}
private static Type GetAccessedType(MemberExpression m) {
if (m.Expression.NodeType == ExpressionType.MemberAccess) {
return GetAccessedType((MemberExpression)m.Expression);
}
return m.Expression.Type;
}
private Expression GetCallMethod(Expression m, ConstantExpression column) {
MethodInfo getMethod = GetGetMethod(m);
var callMethod = Expression.Call(reader, getMethod, column);
if (getMethod.ReturnType == m.Type) {
return callMethod;
}
return Expression.Convert(callMethod, m.Type);
}
private int GetColumnPosition(MemberExpression m) {
return GetColumnPosition(m.Member.Name);
}
private int GetColumnPosition(string columnName) {
int columnPosition = 0;
if (columnPositions.ContainsKey(columnName)) {
columnPosition = columnPositions[columnName];
return columnPosition;
}
columnPosition = columnPositions.Count();
columnPositions.Add(columnName, columnPosition);
return columnPosition;
}
private static MethodInfo GetGetMethod(Expression m) {
Type memberType = GetMemberType(m);
MethodInfo getMethod = null;
switch (Type.GetTypeCode(memberType)) {
case TypeCode.Boolean:
getMethod = getBoolean;
break;
case TypeCode.Byte:
getMethod = getByte;
break;
case TypeCode.Char:
getMethod = getChar;
break;
case TypeCode.DateTime:
getMethod = getDateTime;
break;
case TypeCode.Decimal:
getMethod = getDecimal;
break;
case TypeCode.Double:
getMethod = getDouble;
break;
case TypeCode.Int16:
getMethod = getInt16;
break;
case TypeCode.Int32:
getMethod = getInt32;
break;
case TypeCode.Int64:
getMethod = getInt64;
break;
case TypeCode.String:
getMethod = getString;
break;
case TypeCode.Object:
getMethod = getValue;
break;
default:
if (m.Type == typeof(Guid)) {
getMethod = getGUID;
}
else {
getMethod = getValue;
}
break;
}
return getMethod;
}
private static Type GetMemberType(Expression m) {
Type memberType = null;
if (m.Type.Name == "Nullable`1") {
memberType = m.Type.GetGenericArguments()[0];
}
else {
memberType = m.Type;
}
return memberType;
}
private LambdaExpression GetBindingLambda(LambdaExpression selector) {
var instanceType = selector.Body.Type;
var properties = (from property in instanceType.GetProperties()
where property.PropertyType.IsValueType ||
property.PropertyType == typeof(string)
orderby property.Name
select instanceType.GetField("_" + property.Name,
BindingFlags.Instance |
BindingFlags.NonPublic))
.ToArray();
var bindings = new MemberBinding[properties.Length];
for (int i = 0; i < properties.Length; i++) {
var callMethod = GetColumnReader(
Expression.MakeMemberAccess(
Expression.Parameter(instanceType, "param"),
properties[i]),
i);
bindings[i] = Expression.Bind(properties[i], callMethod);
}
return Expression.Lambda(Expression.MemberInit(Expression.New(instanceType),
bindings),
reader);
}
}
Note
My internet connection is as slow as treacle, so I'll post the project for the full LINQ to SQL IQueryable
Provider later.