Introduction
Recently, I was playing around with Entity Framework (EF) and evaluating it for some projects. I had a very hard time figuring out how to attach a detached object graph to DbContext in a fast and reliable way. Here, I am sharing a simple AttachByIdValue() method implementation that can do this for you. If you are not interested in a full explanation of the problem, jump straight to the method implementation and start attaching your objects.
The Problem
Let’s say we are using EF in a web app to implement a page for managing Order and OrderLines. So we have a parent-child relation (Order and OrderLines) and some referential data that is displayed but won’t be updated (Customer and Products).
We would typically query the above object graph from the database (DB) using EF and send it to the client (browser). When the client sends this object graph back to the server, we would like to persist it, and in order to do so, we must first attach it to DbContext.
The question is how to attach this detached graph without reloading it from the DB and applying changes. Reloading from the DB is a performance hit and it is invasive. If I couldn’t do it without reloading, I would discard EF, because this is a very basic task that I expect my ORM to solve easily. Luckily, I found the solution after a lot of digging.
Add() or Attach()
There are two methods for attaching detached objects, Add() and Attach(), and they receive the graph root object (Order). The Add() method attaches all objects in the graph and marks them as Added, while Attach() also attaches all objects in the graph but marks them as Unchanged.
Since our object graph will usually contain new, modified and unchanged data, our only option is to use one of these two methods to attach the full graph and then traverse it and correct the state of each entry.
So which method should we choose?
Well, actually Attach() is not an option because it can cause key conflicts due to duplicate key values for objects of the same type. If we have an Order with two new OrderLines, those OrderLines would probably both have Id = 0. Attaching this Order with the Attach() method would break because Attach() will mark these two OrderLines as Unchanged, and EF insists that all existing entities should have unique primary keys. This is why we will be using the Add() method for attaching.
Resolving New and Modified Data by Id Value
The question is how we will know the state of each object in the graph (New/Modified/Unchanged/Deleted). Because detached objects are not tracked, the only reliable way would be to reload the object graph from the DB, and as I stated before, I don’t want to do that because of the performance cost.
We can use a simple convention: if Id > 0, the object is modified, and if Id = 0, the object is new. This is a pretty simple convention, but it has drawbacks:
- We can’t detect unchanged objects, so we will be saving unchanged data to the DB. On the bright side, these object graphs should not be that big, so this should not be a performance issue.
- Deleting objects must be handled with custom logic, e.g., by having something like an Order.DeletedOrderLines collection.
In order to read the Id value when attaching objects, all entities will implement the IEntity interface.
public interface IEntity
{
    long Id { get; }
}
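The Id convention itself can be written down as a small helper. This is only a sketch for illustration: the EntityKind enum, the IdConvention.Classify method and the minimal OrderLine class are hypothetical names that are not part of EF, and IEntity is repeated so the snippet compiles on its own. It also assumes that the database generates keys starting from 1, so that 0 is never a real Id.

```csharp
using System;

public interface IEntity
{
    long Id { get; }
}

// Hypothetical, minimal entity used only to exercise the helper below.
public class OrderLine : IEntity
{
    public long Id { get; set; }
}

// Hypothetical enum for this sketch; it is not EF's EntityState.
public enum EntityKind { New, Modified }

public static class IdConvention
{
    // Id == 0 means the object was never saved; any other value means it already exists.
    // Unchanged and deleted objects cannot be detected this way.
    public static EntityKind Classify(IEntity entity)
    {
        if (entity == null) throw new ArgumentNullException(nameof(entity));
        return entity.Id == 0 ? EntityKind.New : EntityKind.Modified;
    }
}
```

For example, IdConvention.Classify(new OrderLine { Id = 0 }) yields EntityKind.New, while an OrderLine with Id = 15 is classified as EntityKind.Modified.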
Ignoring Referential Data
Each object graph can contain referential (read-only) data. In our case, when we are saving an Order, we might have Products and Customer objects in the graph, but we know that we don’t want to save them to the DB. We know that we should save only Order and OrderLines. EF, on the other hand, doesn’t know that. This is why AttachByIdValue accepts a set of child types that should be attached for saving along with Order. All objects in the graph that are neither the root nor of a child type will be attached to the context, but will be marked as Unchanged so they won’t be saved to the DB.
To save only Order (without OrderLines), we should call:
myContext.AttachByIdValue(Order, null);
myContext.SaveChanges();
To save Order and OrderLines, we should call:
myContext.AttachByIdValue(Order, new HashSet<Type>() { typeof(OrderLine) });
myContext.SaveChanges();
Of course, the above HashSet<Type> can be cached in a static field to avoid allocating a new set (and calling typeof) every time an object is attached.
private static readonly HashSet<Type> OrderChildTypes = new HashSet<Type>() { typeof(OrderLine) };
...
myContext.AttachByIdValue(Order, OrderChildTypes);
myContext.SaveChanges();
Method Implementation
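using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.Core.Objects; // EF 6; in EF 5 and earlier use System.Data.Objects

// Extension methods must live in a static class; the class name here is arbitrary.
public static class DbContextExtensions
{
    public static void AttachByIdValue<TEntity>(this DbContext context,
        TEntity rootEntity, HashSet<Type> childTypes)
        where TEntity : class, IEntity
    {
        // Add() attaches the whole graph and marks every object as Added.
        context.Set<TEntity>().Add(rootEntity);
        if (rootEntity.Id != 0)
        {
            context.Entry(rootEntity).State = EntityState.Modified;
        }

        foreach (var entry in context.ChangeTracker.Entries<IEntity>())
        {
            if (entry.State == EntityState.Added && entry.Entity != rootEntity)
            {
                if (childTypes == null || childTypes.Count == 0)
                {
                    // No child types given: everything except the root is referential data.
                    entry.State = EntityState.Unchanged;
                }
                else
                {
                    // Unwrap dynamic proxy types to get the real entity type.
                    Type entityType = ObjectContext.GetObjectType(entry.Entity.GetType());
                    if (!childTypes.Contains(entityType))
                    {
                        entry.State = EntityState.Unchanged;
                    }
                    else if (entry.Entity.Id != 0)
                    {
                        // Existing child: update it. New children (Id == 0) stay Added.
                        entry.State = EntityState.Modified;
                    }
                }
            }
        }
    }
}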
One Gotcha
As I explained earlier, EF insists that all existing entities should have unique primary keys, and this is why you cannot attach to DbContext two unchanged objects of the same type with the same Id. This shouldn’t be the case in general, but I have found one edge case where it might occur. Let’s say we are loading Order, OrderLines and Products, and we have two different OrderLines pointing to the same Product. Normally, EF will set a reference to the same Product object on these OrderLines, unless you are loading your data using AsNoTracking to get better performance, in which case each OrderLine gets a reference to a separate Product object that is equal by all values. I didn’t find documentation of this behavior anywhere; I discovered it by accident while I was struggling to attach objects to DbContext.
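One possible way to sidestep this is to walk the graph before attaching it and point every OrderLine at a single Product instance per Id. The following is a minimal sketch, not something EF provides: the simplified Order, OrderLine and Product classes and the ReferenceFixup.ShareProductInstances helper are hypothetical and exist only to illustrate the idea.

```csharp
using System.Collections.Generic;

// Simplified, hypothetical entity shapes used only for this sketch.
public class Product
{
    public long Id { get; set; }
    public string Name { get; set; }
}

public class OrderLine
{
    public long Id { get; set; }
    public Product Product { get; set; }
}

public class Order
{
    public long Id { get; set; }
    public List<OrderLine> OrderLines { get; set; } = new List<OrderLine>();
}

public static class ReferenceFixup
{
    // Replaces duplicate Product instances (same Id, different object)
    // with the first instance seen for that Id.
    public static void ShareProductInstances(Order order)
    {
        var seen = new Dictionary<long, Product>();
        foreach (var line in order.OrderLines)
        {
            if (line.Product == null) continue;

            Product existing;
            if (seen.TryGetValue(line.Product.Id, out existing))
            {
                line.Product = existing;
            }
            else
            {
                seen.Add(line.Product.Id, line.Product);
            }
        }
    }
}
```

Calling ReferenceFixup.ShareProductInstances(order) before AttachByIdValue should leave only one Product object per Id in the graph, so marking the referential data as Unchanged no longer runs into duplicate keys.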