Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / desktop / Win32

LINQ-to-EF Efficiency: A Cautionary Tale

4.33/5 (3 votes)
26 May 2012CPOL2 min read 21K  
Avoid Entity Framework lazy-loading relationship navigation when iterating large record sets.

Introduction

I like using SQL Server’s SQL Profiler to snoop on the TSQL that Entity Framework and Linq generate. They produce dynamic procedures that are often remarkable and sometimes puzzling.

If you observe less than satisfactory performance in your Linq queries or suspect unnecessary load on your network, it is often useful to peel back the wizard’s curtain and watch what’s happening behind the scenes. A change to one of our Linq routines recently yielded big benefits in efficiency.

Background

Several of our applications provide end users with a combobox from which they can choose a Customer. I like to supplement the Customer’s name with locale information so that users can select the correct account of Customers with multiple locations. The database tables that store that data look like the provided diagram.

Using the Code

Each Customer account can have multiple Address Types – "Billing", "Shipping", etc. Each of them have a single related Country record. I want each row in my combobox’s dropdown to look like <Customer Name> <Country> <Address>.

We transfer Entity data into strongly-typed datasets. The original code for doing this is inside of a Customer DataTable partial class and was similar to:

C++
// Establish the ObjectContext (which opens the data connection)
using (Base.Data.DLTIEntities edmDLTI = new Base.Data.DLTIEntities)
{
       // Define the query (this does NOT cause a trip to the database yet)
       IQueryable<Base.Data.Customer> qryCust = from c in edmDLTI.Customers
              where c.ActiveInd || c.CustomerName == "Unknown Customer"
              orderby c.CustomerName, c.CustomerNumber
              select c
              ;

    // The ToArray() call will cause the query to execute a single round trip to the database.
       foreach (Durcon.Base.Data.Customer entCust in qryProjCust.ToArray<Base.Data.Customer>())
       {
              // Assign DataTable field values and add the DataRow to the DataTable
              CustomerRow dtrCustomer = NewCustomerRow();
              dtrCustomer.ActiveInd = entCust.ActiveInd;
              dtrCustomer.CustomerID = entCust.CustomerID;
              dtrCustomer.CustomerText = entCust.CustomerName.Trim() + " (" 
                     + entCust.CustomerNumber.Trim() + ")"
                     ;

              if (entCust.CustomerAddresses.Count > 0)
              {
                     // Get the Customer’s "first" address as dictated by its Address Type (which is usually the "Billing" address)
                     Base.Data.Address entAddr = entCust.CustomerAddresses.OrderBy(ca => ca.Address.AddressTypes.AddressTypeDesc).FirstOrDefault<Base.Data.CustomerAddress>().Address;
                     CustomerText += " " + entAddr.City.Trim() + (entAddr.State.Trim().Length == 0 ? "" : ", ")
                           + entAddr.State.Trim()
                           + (entAddr.Country.CountryDesc.Trim().Length == 0 || entAddr.Country.CountryDesc.Trim() == "US" ? "" : ", " + entAddr.Country.CountryDesc.Trim())
                           ;
              }

              AddCustomerRow(dtrCustomer);
       }

// Close the database connection.
}

The ability to easily navigate relationships via EF’s lazy loading of data is seductive. Like so many other super powers, it must be used carefully and only for the greater good.

Each of the bolded segments above invokes a round-trip to the database server every time they are encountered. We would expect the first one because we need a set of Customers through which to iterate. But then, within the iteration loop, each of those bolded segments require a database round-trip for each Customer record.

For a list of 1,000 Customer records, this caused more than 4,000 round-trips to the database. Worse, Linq-to-Entities usually executes multiple TSQL transactions per round-trip – one to establish parameters and their values, another to execute the SQL statement and so on. Multiply that by every Customer combobox. And then by every end-user. The approach above required a full 6 seconds to fill each list and an obscene number of database transactions.

Please don’t tell our Network Administrator.

The Microsoft gurus often tout the Linq-to-EF designed segregation between database server execution and .NET code execution. My new nuance on that is that the two should be segregated as far between as possible.

The quest for our more efficient approach resulted in the following code:

First create a place-holder class:

C++
public class ListCustomer
{
       private bool booActiveInd = false;
       private long lngCustomerID = 0;
       private string strCustomerText = "";
       private string strCustAddr = "";

       public bool ActiveInd { get { return booActiveInd; } set { booActiveInd = value; } }
       public long CustomerID { get { return lngCustomerID; } set { lngCustomerID = value; } }
       public string CustomerText { get { return strCustomerText; } set { strCustomerText = value; } }
       public string CustAddr { get { return strCustAddr; } set { strCustAddr = value; } }
}

Then, suffer a little more complexity in the initial query design, and still coding within the strongly-typed Customer DataTable mirrored partial class:

C++
// Establish the ObjectContext (which opens the data connection)
using (Base.Data.DLTIEntities edmDLTI = new Base.Data.DLTIEntities)
{
       // Create an all-encompassing query statement that will ensure a single database round-trip 
       IQueryable<ListCustomer> qryQuoteCust = from c in edmDLTI.Customers
              where c.ActiveInd || c.CustomerName == "Unknown Customer"
              orderby c.CustomerName, c.CustomerNumber
              // row-append single instances of the place-holder class
              select new ListCustomer
              {
                     ActiveInd = c.ActiveInd,
                     CustomerID = c.CustomerID,
                     CustomerText = c.CustomerName.Trim() + " (" + c.CustomerNumber.Trim() + ") ",
                     CustAddr = (  from ca in edmDLTI.CustomerAddresses
                                         // Despite what we’ve read, these all resolve to TSQL INNER JOINs.  Watch SQL Profiler.
                                         join a in edmDLTI.Addresses on ca.AddressID equals a.AddressID
                                         join at in edmDLTI.AddressTypes on a.AddressTypeID equals at.AddressTypeID
                                          join co in edmDLTI.Countries on a.CountryID equals co.CountryID
                                   where ca.CustomerID == c.CustomerID
                                  orderby at.AddressTypeDesc
                                  select a.City.Trim()
                                         // These c# "if" clauses create interesting CASE clauses in TSQL.  Watch SQL Profiler.
                                         + (a.State.Trim().Length == 0 ? "" : ", ") + a.State.Trim()
                                          + (co.CountryDesc.Trim().Length == 0 || co.CountryCode.Trim() == "US" ? "" : ", " + co.CountryDesc.Trim())
                     // Grab only the first record that respects the Address Type "orderby" sort.  Emulates a TSQL "TOP 1" clause.  Also one of the few ways to emulate a true TSQL "LEFT OUTER JOIN".
                     ).FirstOrDefault(),
              }
              ;
       // A single TSQL statement and database round-trip is executed here.
foreach (ListCustomer licCust in qryQuoteCust)
{
    CustomerRow dtrCust = NewCustomerRow();
    dtrCust.ActiveInd = licCust.ActiveInd;
    dtrCust.CustomerID = licCust.CustomerID;
    dtrCust.CustomerText = licCust.CustomerText + licCust.CustAddr;
    AddCustomerRow(dtrCust);
}
// Close the database connection.
}

Points of Interest

All data gathering duties were performed by the initial query and there were no subsequent visits to the database for additional relational data – no lazy loading. Oh what a change in approach can garner ...

Execution time: Less than one second. Database round-trips: 1

Please tell our Network Administrator.

The end does not always justify the means.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)