Introduction
The more I use it, the more I learn about the Entity Framework and LINQ to Entities. I started writing stored procedures long before I ever started using Object/Relational Mapping (O/RM) tools like the Entity Framework, so sometimes I play with getting my LINQ code to generate the exact query syntax I want. In this article, I share some of the patterns I've discovered.
The Pattern and the Problem
Many of you are familiar with the database structure presented here – it’s a fairly standard pattern that many developers have implemented in some form over the course of their careers. My implementation consists of a Person
table, a set of logon names in a PersonLogonName
table, and a User
table. User
s may or may not be Person
s, and Person
s may or may not be User
s.
Although the User
table and the PersonLogonName
table both store a logon name, the User
table is only concerned with the logon name currently used to access the system while the PersonLogonName
table is more concerned with associating a history of logon names with a particular person. The User
table is part of a code module that handles settings such as how many rows the user prefers to display on a certain grid, while the Person
table actually associates people (who may or may not be users) with other system objects. Since the two tables are used for entirely different purposes, and since a logon name may exist in either table while not in the other, a foreign key does not exist. This lack of a foreign key will become important later.
The goal is to retrieve a list of active logon names for use during data entry. I decided to start with active Person
s, which is easy:
SELECT PersonLogonName
FROM Person INNER JOIN PersonLogonName
ON Person.PersonId = PersonLogonName.PersonId
WHERE Person.IsActive = 1
Creating this query with LINQ to Entities is also relatively easy. Here is one way to write this code after generating a data model based on the database schema (and pluralizing entity names):
var query = entities.People
.Include(
personEntity => personEntity.PeopleToLogonNames)
.Where(personEntity => personEntity.IsActive);
Working the Problem
Although simple, the query above was returning the entire history of logon names when all I wanted was the latest, active, non-expired logon name. Retrieving one-out-of-many child objects is a Top per Group pattern, which is something I've implemented in the past with a SELECT MAX
in a subquery. SQL Server 2005 introduced the CROSS APPLY
operator, which is a great tool for implementing this pattern when coupled with an ordered SELECT TOP 1
subquery like the following:
SELECT PersonLogonName
FROM Person
CROSS APPLY (SELECT TOP (1) PersonLogonName
FROM PersonLogonName
WHERE Person.PersonId = PersonLogonName.PersonId
AND PersonLogonName.ActiveDate <= SysDateTime()
AND (PersonLogonName.ExpirationDate IS NULL
OR PersonLogonName.ExpirationDate > SysDateTime())
ORDER BY PersonLogonName.ActiveDate DESC) AS LatestLogonName
WHERE Person.IsActive = 1
I originally thought the LINQ to Entities version would be straight-forward and started with this simple projection technique:
var query = entities.People
.Where(personEntity => personEntity.IsActive)
.Select(personEntity => personEntity.PeopleToLogonNames
.Where(personLogonNameEntity =>
personLogonNameEntity.ActiveDate <= DateTime.Now
&& (personLogonNameEntity.ExpirationDate == null
|| personLogonNameEntity.ExpirationDate > DateTime.Now))
.OrderByDescending(personLogonNameEntity => personLogonNameEntity.ActiveDate)
.FirstOrDefault());
This resulted in the generation of an OUTER APPLY
, which gave me some NULL
logon names (not all Person
s have logon names in my system). What I needed was a CROSS APPLY
, and the only way I could get LINQ to Entities generate a CROSS APPLY
operator was to add an additional WHERE
clause, like this:
var query = entities.People
.Where(personEntity => personEntity.IsActive)
.Select(personEntity => personEntity.PeopleToLogonNames
.Where(personLogonNameEntity =>
personLogonNameEntity.ActiveDate <= DateTime.Now
&& (personLogonNameEntity.ExpirationDate == null
|| personLogonNameEntity.ExpirationDate > DateTime.Now))
.OrderByDescending(personLogonNameEntity => personLogonNameEntity.ActiveDate)
.FirstOrDefault())
.Where(entity => entity.LogonName != null);
This may seem strange, but the Entity Framework is actually intelligent enough to know that the best way to eliminate NULL
s from your results is to use a CROSS APPLY
in lieu of an OUTER APPLY
. EF would still render the WHERE
clause in the generated SQL, but it is ignored by the query processor when the query is compiled.
Next, I needed to include data from the User
table, which presented its own problems. As previously mentioned, there is no foreign key between the User
and PersonLogonName
tables, which precludes the use of the Include
method or any Navigation Properties the Entity Framework automatically generates. Also, the Entity Framework doesn’t support the T-SQL FULL OUTER JOIN
operator (which is understandable, since not all database systems include support for this). So I decided to use a UNION
operator, like this:
var query = entities.People
.Where(personEntity => personEntity.IsActive)
.Select(personEntity => personEntity.PeopleToLogonNames
.Where(personLogonNameEntity =>
personLogonNameEntity.ActiveDate <= DateTime.Now
&& (personLogonNameEntity.ExpirationDate == null
|| personLogonNameEntity.ExpirationDate > DateTime.Now))
.OrderByDescending(personLogonNameEntity => personLogonNameEntity.ActiveDate)
.FirstOrDefault())
.Where(entity => entity.LogonName != null)
.Select(entity => new { entity.LogonName })
.Union(entities.Users
.Select(userEntity => new { userEntity.LogonName }))
.OrderBy(entity => entity.LogonName);
Notice the use of Select
projections to limit the column lists in both result sets and ensure the lists are identical. Failing to do this will result in compilation errors. This worked well, but caused logon names marked as expired in the PersonLogonName
table to show when they also existed in the User
table. I needed a way to filter them out, and I figured a LEFT OUTER JOIN
on the second SELECT
of my UNION
was the best way to accomplish this. So I was looking to generate SQL similar to the following:
SELECT PersonLogonName
FROM Person
CROSS APPLY (SELECT TOP (1) PersonLogonName
FROM PersonLogonName
WHERE Person.PersonId = PersonLogonName.PersonId
AND PersonLogonName.ActiveDate <= SysDateTime()
AND (PersonLogonName.ExpirationDate IS NULL
OR PersonLogonName.ExpirationDate > SysDateTime())
ORDER BY PersonLogonName.ActiveDate DESC) AS LatestLogonName
WHERE Person.IsActive = 1
UNION
SELECT [User].UserLogonName
FROM [User] LEFT OUTER JOIN PersonLogonName
ON [User].UserLogonName = PersonLogonName.PersonLogonName
WHERE PersonLogonName.ExpirationDate IS NULL
OR PersonLogonName.ExpirationDate > SysDateTime()
ORDER BY PersonLogonName
Since there were no Navigation Properties available, I had no choice but to use the Join
method in LINQ to Entities. I originally tried the following:
var query = entities.People
.Where(personEntity => personEntity.IsActive)
.Select(personEntity => personEntity.PeopleToLogonNames
.Where(personLogonNameEntity =>
personLogonNameEntity.ActiveDate <= DateTime.Now
&& (personLogonNameEntity.ExpirationDate == null
|| personLogonNameEntity.ExpirationDate > DateTime.Now))
.OrderByDescending(personLogonNameEntity => personLogonNameEntity.ActiveDate)
.FirstOrDefault())
.Where(entity => entity.LogonName != null)
.Select(entity => new { entity.LogonName })
.Union(entities.Users
.Join(entities.PersonLogonNames,
outerEntity => outerEntity.LogonName,
innerEntity => innerEntity.LogonName,
(userEntity, innerEntity) => new
{
LogonName = userEntity.LogonName,
ExpirationDate = innerEntity.ExpirationDate
})
.Where(entity => entity.ExpirationDate == null
|| entity.ExpirationDate > DateTime.Now)
.Select(entity => new { entity.LogonName }))
.OrderBy(entity => entity.LogonName);
Notice the addition of the Join
method. The table being joined and both the operands to be used in the generated ON
clause are represented in the first three parameters. The fourth parameter specifies the structure of the result set, so this method also performs a projection. Looks perfect, but the SQL generated would always be an INNER JOIN
, not a LEFT OUTER JOIN
.
The Final Solution
I tried modifying the above code with many different combinations of DefaultIfEmpty
method calls and NULL
comparisons, but could not persuade the Entity Framework to generate a LEFT OUTER JOIN
. I researched this behavior, found others experiencing the same problem, and saw that the GroupJoin
method would produce the LEFT OUTER JOIN
that I desired. Unfortunately, the purpose of the GroupJoin
method is to create result sets with multiple child records in the form of a parent-to-children tree, so I needed to flatten this resulting set with a call to the SelectMany
method. Here is the final code:
var query = entities.People
.Where(personEntity => personEntity.IsActive)
.Select(personEntity => personEntity.PeopleToLogonNames
.Where(personLogonNameEntity =>
personLogonNameEntity.ActiveDate <= DateTime.Now
&& (personLogonNameEntity.ExpirationDate == null
|| personLogonNameEntity.ExpirationDate > DateTime.Now))
.OrderByDescending(personLogonNameEntity => personLogonNameEntity.ActiveDate)
.FirstOrDefault())
.Where(entity => entity.LogonName != null)
.Select(entity => new { entity.LogonName })
.Union(entities.Users
.GroupJoin(entities.PersonLogonNames,
outerEntity => outerEntity.LogonName,
innerEntity => innerEntity.LogonName,
(innerEntity, outerEntities) => new
{
LogonName = innerEntity.LogonName,
PersonLogonNames = outerEntities
})
.SelectMany(personLogonNameEntities =>
personLogonNameEntities.PersonLogonNames
.DefaultIfEmpty(),
(combinedUserEntity, personLogonNameEntity) => new
{
combinedUserEntity.LogonName,
personLogonNameEntity.ExpirationDate
})
.Where(combinedUserEntity =>
combinedUserEntity.ExpirationDate == null
|| combinedUserEntity.ExpirationDate > DateTime.Now)
.Select(combinedUserEntity => new { combinedUserEntity.LogonName }))
.OrderBy(entity => entity.LogonName);
I admit, that looks like a lot of complicated code for something so simple, but the SQL it generates is exactly what I originally intended and executes very quickly. Here is the generated SQL:
SELECT
[Distinct1].[C1] AS [C1],
[Distinct1].[C2] AS [C2]
FROM (SELECT DISTINCT
[UnionAll1].[C1] AS [C1],
[UnionAll1].[PersonLogonName] AS [C2]
FROM (SELECT
1 AS [C1],
[Limit1].[PersonLogonName] AS [PersonLogonName]
FROM (SELECT
[Extent1].[PersonId] AS [PersonId]
FROM [dbo].[Person] AS [Extent1]
WHERE [Extent1].[IsActive] = 1 ) AS [Filter1]
CROSS APPLY (SELECT TOP (1)
[Project1].[PersonLogonName] AS [PersonLogonName]
FROM (SELECT
[Extent2].[PersonLogonName] AS [PersonLogonName],
[Extent2].[ActiveDate] AS [ActiveDate]
FROM [dbo].[PersonLogonName] AS [Extent2]
WHERE ([Filter1].[PersonId] = [Extent2].[PersonId])
AND ([Extent2].[ActiveDate] <= SysDateTime())
AND (([Extent2].[ExpirationDate] IS NULL)
OR ([Extent2].[ExpirationDate] > SysDateTime()))
) AS [Project1]
ORDER BY [Project1].[ActiveDate] DESC) AS [Limit1]
WHERE [Limit1].[PersonLogonName] IS NOT NULL
UNION ALL
SELECT
1 AS [C1],
[Extent3].[UserLogonName] AS [UserLogonName]
FROM [dbo].[User] AS [Extent3]
LEFT OUTER JOIN [dbo].[PersonLogonName] AS [Extent4]
ON [Extent3].[UserLogonName] = [Extent4].[PersonLogonName]
WHERE ([Extent4].[ExpirationDate] IS NULL)
OR ([Extent4].[ExpirationDate] > SysDateTime())) AS [UnionAll1]
) AS [Distinct1]
ORDER BY [Distinct1].[C2] ASC
Points of Interest
Using a meticulous troubleshooting process that broke the problem down into smaller pieces and concentrated on each piece individually, I was able to finally solve the larger problem with unexpected code. It was an interesting trip with a satisfying result.
Entity Framework version 6.1.3 was used for this article.
History
- 12th May, 2017: First version