Intro
I've recently been working on a smart client (WinForms) application backed by Entity Framework 4 (EF4) on Visual Studio 2010. In this blog post I will share some tips about the limitations of EF4 and the points to be careful about.
Why Entity Framework (EF)?
Throughout my .NET development career I've used a couple of ORM frameworks such as NHibernate, LLBLGen and Subsonic. Each framework has its own approach to ORM. My favourite ORM was, and actually still is, NHibernate. I like the open source nature of NH, and its community is very active. LLBLGen was my second favourite because the tooling was in place and entity mappings could be handled easily. Subsonic was minimalist and a real joy to develop with. But each of these ORM frameworks drove me crazy from time to time. To name a few issues: configuring NHibernate is really a pain although there are some add-ins for it, LLBLGen's query constructs are too imperative, and Subsonic's minimalism sometimes does not fit well when implementing the less important parts (the classic 20%) of a system. Oh, I forgot to mention LinqToSql, Microsoft's first shot in the ORM area. I've used it as well.
Since EF was first announced I have been watching its progress and keeping an eye on the experience of people using the early versions. My first impression was: stay away until the next major release. That major release seems to be EF4. My answers to the question "why EF?" are:
- EF4 comes as part of .NET Framework 4. No extra installation needed
- EF4 is fully integrated with Visual Studio 2010
- The EF4 Model Designer, incorporated in VS 2010, is nearly fantastic
- With EF4 you can persist POCOs to supported databases easily
- With EF4 you can generate your entities from your database schema
- Complex Types are fancy
- Function imports (stored procedure support) work like a charm. Paired with complex types, this gives you real power in brownfield projects
Delete orphan entities automatically
In a typical parent/child construct you usually define an implicit rule: child entities (records) can not exist without a parent entity (record). For example, if you have an Order entity as the parent and an OrderLine entity as the child, you expect the OrderLine entities to be removed automatically when the parent Order entity is removed. This is a basic setup on the database side; you set the foreign key to cascade on delete and you are good to go. But from the EF point of view it is not so simple. Even if you have that setting on the database side, when you delete a parent entity and call SaveChanges on your object context you will get
System.InvalidOperationException occurred Message=The operation failed: The relationship could not be changed because one or more of the foreign-key properties is non-nullable. When a change is made to a relationship, the related foreign-key property is set to a null value. If the foreign-key does not support null values, a new relationship must be defined, the foreign-key property must be assigned another non-null value, or the unrelated object must be deleted.
The problem is that when you remove a parent entity (Order in our case) by calling DeleteObject on your ObjectContext, EF automatically nullifies the Parent reference and tries to nullify the ParentId backing field of all related child entities (OrderLine in our case). But since we defined the ParentId field as non-nullable on the database side, we get the InvalidOperationException during SaveChanges. The exception actually tells you what you can do to work around the problem. You can point ParentId to a default "null object" record, or you can explicitly call DeleteObject for each related child entity. Both of these methods require some code, and you have to communicate the convention to your team mates. Luckily a more automatic solution exists: you just tweak your database schema so that the ParentId column (the foreign key pointing to the parent Order record) is the second, or nth, primary key field. The primary key of the OrderLine entity then becomes a composite key made up of its own key column and the ParentId column. With that in place EF4 stops complaining and performs the delete smoothly. Behind the scenes, I guess, EF4 infers that since the parent entity is deleted, all entities whose EntityKey includes it must be deleted too. That is simple, huh?
I hope future versions of EF will include a DeleteOrphan setting that we can set on associations.
Change EF Connection String At Runtime
EF4 connection strings are different from the usual SQL connection strings. An EF4 connection string must include metadata about the EF model resources so that EF can resolve the model and apply the mappings and other settings you implicitly introduced through the Model Designer. In web application scenarios you will likely not have to worry about changing the EF connection string at runtime, since the Model Designer asks whether you want to persist the selected connection information in the config file. After that you simply change the SQL connection information before you deploy your web app. But in smart client scenarios this technique does not fit well, since you do not want to expose database connection information in the app.config file. Most likely you will keep the database connection information in a user-specific settings file, encrypted so that it is not human readable. When your application starts you read that information, decrypt the connection string and use that value to open a database connection.
Here is a typical Model Designer generated EF connection string
<add name="eXpressOtoEntities" connectionString="metadata=res://*/eXpressOtoEntities.csdl|res://*/eXpressOtoEntities.ssdl|res://*/eXpressOtoEntities.msl;provider=System.Data.SqlClient;provider connection string=&quot;Data Source=D00450065;Initial Catalog=eXpressOto;Integrated Security=True;MultipleActiveResultSets=True&quot;" providerName="System.Data.EntityClient" />
You have to construct a connection string similar to the one above. For practical purposes you can define a factory method which constructs an EF connection string and instantiates an ObjectContext for you. Here is the sample method

    public static eXpressOtoEntities CreateObjectContext()
    {
        if (DbConnectionManager.Current == null)
            throw new Exception("Current database connection is not set on DbConnectionManager");

        // Wrap the plain SQL connection string in an EF connection string
        string conStr = "metadata=res://*;provider=System.Data.SqlClient;provider connection string='" + DbConnectionManager.Current.ConnectionString + "'";
        eXpressOtoEntities ctx = new eXpressOtoEntities(conStr);
        return ctx;
    }

In the above code snippet eXpressOtoEntities, the return type of the method, is my ObjectContext class. DbConnectionManager is a custom static class which is used to encrypt/store and load/decrypt the database connection string. The rest is straightforward: we just perform a simple string concatenation.
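As an alternative to concatenating the string by hand, the framework ships an EntityConnectionStringBuilder class in System.Data.EntityClient that handles quoting of the inner connection string. The sketch below is not the code used in the project; the EntityConnectionStrings class name and the "res://*/" metadata value in the usage line are assumptions made for illustration.

```csharp
using System;
using System.Data.EntityClient;

// Hypothetical helper, not part of the original application.
public static class EntityConnectionStrings
{
    // Wraps a plain SQL Server connection string in an EF connection string.
    public static string Build(string metadata, string sqlConnectionString)
    {
        var builder = new EntityConnectionStringBuilder
        {
            Metadata = metadata,                          // where the csdl/ssdl/msl resources live
            Provider = "System.Data.SqlClient",           // underlying ADO.NET provider
            ProviderConnectionString = sqlConnectionString
        };
        return builder.ToString();
    }
}
```

A call such as EntityConnectionStrings.Build("res://*/", decryptedSqlConnectionString) would then produce a value that can be passed to the ObjectContext constructor, where decryptedSqlConnectionString stands for whatever your own connection manager returns.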
Implementing IDataErrorInfo on your entities
The IDataErrorInfo interface is a standard .NET interface residing in the System.ComponentModel namespace which provides the functionality to offer custom error information that a user interface can bind to. Most WinForms controls, standard or third party, support IDataErrorInfo internally or through the standard .NET ErrorProvider component. If you want to provide validation error information to the end users directly from the entities layer, you just have to implement IDataErrorInfo on your entities, which is simple enough.
EF4 uses T4 templates to generate your entity classes from your model. When you add an entity to your model from your database schema, the EF runtime (VS2010, actually) uses predefined T4 templates to generate the corresponding entity classes. As far as I know, there are two different ways to add extra functionality to your EF entity classes.
1) You can create your own T4 template which incorporates IDataErrorInfo and generates a default implementation of the interface, and feed the EF runtime with this template. This is a somewhat complicated process, and I can assure you that you would not want to go through it just to have entities supporting IDataErrorInfo. You can read this post for the details.
2) The standard EF T4 template generates partial classes for your entities, which means you can add functionality to your automatically generated entity classes by simply creating a partial class file. This approach is simple and sufficient most of the time. Below is a sample partial entity class with an IDataErrorInfo implementation

    using System;
    using System.ComponentModel;
    using System.Text;

    partial class ProjectType : IDataErrorInfo
    {
        #region IDataErrorInfo Members

        // Combined error text for the whole entity
        public string Error
        {
            get
            {
                StringBuilder sb = new StringBuilder();

                if (!String.IsNullOrWhiteSpace(this["Name"]))
                    sb.AppendLine(this["Name"]);

                if (!String.IsNullOrWhiteSpace(this["Prefix"]))
                    sb.AppendLine(this["Prefix"]);

                string errors = sb.ToString();
                return !String.IsNullOrEmpty(errors) ? errors : String.Empty;
            }
        }

        // Error text for a single property; empty string means "valid"
        public string this[string columnName]
        {
            get
            {
                switch (columnName)
                {
                    case "Name":
                        return String.IsNullOrWhiteSpace(Name) ? "Name can not be empty" : String.Empty;
                    case "Prefix":
                        return String.IsNullOrWhiteSpace(Prefix) ? "Prefix can not be empty" : String.Empty;
                    default:
                        return String.Empty;
                }
            }
        }

        #endregion
    }

The sample class above adds IDataErrorInfo implementation to my auto generated ProjectType entity class and is used to check if required fields Name and Prefix have values. That is it.
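To see how such an implementation is consumed, here is a minimal sketch that does not depend on EF at all. SampleEntity is a hypothetical stand-in for a generated entity; a data-bound control or an ErrorProvider would read the indexer and the Error property in much the same way the Main method does.

```csharp
using System;
using System.ComponentModel;

// Hypothetical stand-in for a generated entity class (assumption, not from the model).
public class SampleEntity : IDataErrorInfo
{
    public string Name { get; set; }
    public string Prefix { get; set; }

    // Entity-level message: all non-empty property messages, one per line.
    public string Error
    {
        get
        {
            string[] messages = { this["Name"], this["Prefix"] };
            string[] failures = Array.FindAll(messages, m => !string.IsNullOrEmpty(m));
            return string.Join(Environment.NewLine, failures);
        }
    }

    // Property-level message: an empty string means the value is valid.
    public string this[string columnName]
    {
        get
        {
            switch (columnName)
            {
                case "Name":
                    return string.IsNullOrWhiteSpace(Name) ? "Name can not be empty" : string.Empty;
                case "Prefix":
                    return string.IsNullOrWhiteSpace(Prefix) ? "Prefix can not be empty" : string.Empty;
                default:
                    return string.Empty;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var entity = new SampleEntity { Name = "Demo", Prefix = "" };
        Console.WriteLine(entity["Prefix"]);   // message for the invalid property
        Console.WriteLine(entity.Error);       // combined message for the entity
    }
}
```

Checking Error before calling SaveChanges is one simple way to stop invalid entities from reaching the database.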
Handling General Definition Data
Nearly every system needs some sort of general definition data during operation. For example, all web sites which require registration ask for a country, which is general definition data. Most of the time that sort of data does not need extra or complex processing; it is not at the core of the operation, although it may be a core part of BI. System admins just define the data and the applications display it to the end user. The schema of that sort of data is very simple: an Id and a Name field are enough, and sometimes a description field is included as well. In our project we generalized that sort of data to include Id, Name and Active fields and defined a table for each type. Maybe we could have handled this data with a single table on the database side and an inheritance strategy on the EF side, but for simplicity we decided against that.
We decided to provide a unified editor to the admins so that they can manipulate definition data. We designed the editor to operate on interfaces so that it can handle any general definition data entity implementing a contract, in a way some sort of data contract. Let's walk through the process step by step and look at some code
Step-1: Define the data contract, that is IDefinitionDataEntity interface

    public interface IDefinitionDataEntity
    {
        Int32 Id { get; set; }
        string Name { get; set; }
        bool Active { get; set; }
    }

This is simple: we just define a data contract describing the structure of our general data.
Step-2: Mark definition data entities with IDefinitionDataEntity interface

    [Description("Vehicle Kinds")]
    partial class VehicleKind : IDefinitionDataEntity
    {
    }

We just define a partial class for our VehicleKind entity class, which was automatically generated by EF. Since our database schema contains Id, Name and Active columns, the EF-generated VehicleKind class already contains these properties and we simply mark our partial class with IDefinitionDataEntity.
NOTE: Description attribute is a standard .NET framework attribute which resides in System.ComponentModel namespace. We will use this attribute to render user friendly entity information in our data editor.
Step-3: Get all entity types implementing the IDefinitionDataEntity interface with reflection

    public class DefinitionDataEntityTypeInfo
    {
        public Type EntityType { get; set; }
        public string Description { get; set; }

        public static ReadOnlyCollection<DefinitionDataEntityTypeInfo> TypeInfos
        {
            get;
            private set;
        }

        static DefinitionDataEntityTypeInfo()
        {
            PrepareTypeInfos();
        }
    }

DefinitionDataEntityTypeInfo class is just a helper class which will be used to hold the information about the definition data entity classes which is populated through reflection. TypeInfos is where we hold the data for all definition entities.

    private static void PrepareTypeInfos()
    {
        Type tt = typeof(IDefinitionDataEntity);

        // Find every class in this assembly that implements the contract
        var results = (
            from a in
                (from type in Assembly.GetExecutingAssembly().GetTypes()
                 where type.GetInterface(tt.FullName, true) != null
                 select type)
            where a.IsClass == true
            select new DefinitionDataEntityTypeInfo { EntityType = a }
        ).ToList<DefinitionDataEntityTypeInfo>();

        // Copy the text of the Description attribute, when present
        results.ForEach(
            delegate(DefinitionDataEntityTypeInfo t)
            {
                var z = (from x in t.EntityType.GetCustomAttributes(typeof(DescriptionAttribute), false)
                         select (DescriptionAttribute)x).FirstOrDefault<DescriptionAttribute>();
                if (z != null)
                    t.Description = z.Description;
            });

        TypeInfos = results.AsReadOnly();
    }

The PrepareTypeInfos static method of the DefinitionDataEntityTypeInfo class is called inside the static constructor. It simply inspects all entity classes implementing the IDefinitionDataEntity interface and stores the inspection data in the TypeInfos static property.
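The same scan can be written a little more compactly with Type.IsAssignableFrom. The following self-contained sketch is only an illustration: DefinitionTypeScanner and FuelKind are hypothetical names, and unlike the method above it falls back to the class name when no Description attribute is present.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Reflection;

public interface IDefinitionDataEntity
{
    int Id { get; set; }
    string Name { get; set; }
    bool Active { get; set; }
}

[Description("Vehicle Kinds")]
public class VehicleKind : IDefinitionDataEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool Active { get; set; }
}

// Hypothetical second entity without a Description attribute.
public class FuelKind : IDefinitionDataEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool Active { get; set; }
}

// Hypothetical helper class, not part of the original application.
public static class DefinitionTypeScanner
{
    // Returns (type, display text) pairs for every concrete class implementing the contract.
    public static List<KeyValuePair<Type, string>> Scan(Assembly assembly)
    {
        Type contract = typeof(IDefinitionDataEntity);
        return assembly.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract && contract.IsAssignableFrom(t))
            .Select(t => new KeyValuePair<Type, string>(t, Describe(t)))
            .ToList();
    }

    // Uses the Description attribute when present, otherwise the class name (assumption).
    private static string Describe(Type type)
    {
        DescriptionAttribute attribute = type
            .GetCustomAttributes(typeof(DescriptionAttribute), false)
            .Cast<DescriptionAttribute>()
            .FirstOrDefault();
        return attribute != null ? attribute.Description : type.Name;
    }
}
```

Calling DefinitionTypeScanner.Scan(typeof(VehicleKind).Assembly) returns one pair per definition entity, which is enough to fill a lookup control.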
Step-4: Get the entity set name of a definition entity type
In order to query and modify data we must either have strongly typed entity sets for our definition entity classes or look up the name of the entity set for each entity type inside our object context. To meet this requirement we create a partial class for our ObjectContext implementation and introduce two methods for our purpose: GetEntitySet and GetEntitySetName. Here is the implementation

    partial class eXpressOtoEntities
    {

We use the metadata, actually included in the automatically generated EDM, attached to our ObjectContext implementation (eXpressOtoEntities) to inspect the entity set name for a specific definition entity type.
NOTE: Methods introduced here can be used to get entity set or entity set name of any EF entity class as well.
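The bodies of GetEntitySet and GetEntitySetName are not reproduced in this post, so the following is only a sketch of how such a lookup could be done with the EF4 metadata API, not the author's implementation. The FindEntitySetName extension method is a hypothetical name, and the sketch assumes that the conceptual entity name equals the CLR class name (the default for generated code) and that no inheritance is involved.

```csharp
using System;
using System.Data.Metadata.Edm;
using System.Data.Objects;
using System.Linq;

// Hypothetical helper, not the GetEntitySetName method from the post.
public static class ObjectContextMetadataExtensions
{
    // Returns the name of the entity set whose element type matches the CLR type name,
    // or null when no such set exists in the default container.
    public static string FindEntitySetName(this ObjectContext context, Type entityType)
    {
        EntityContainer container = context.MetadataWorkspace.GetEntityContainer(
            context.DefaultContainerName, DataSpace.CSpace);

        return container.BaseEntitySets
            .OfType<EntitySet>()                                    // skip association sets
            .Where(set => set.ElementType.Name == entityType.Name)  // assumes matching names
            .Select(set => set.Name)
            .FirstOrDefault();
    }
}
```

With derived entity types the lookup would have to walk up to the base type first, because only the base type owns an entity set.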
Step-5: Load definition data
The final step is loading our strongly typed definition data into the unified editor based on the user's selection. Here is the code.

    private void LoadData(Type entityType)
    {
        ObjectQuery<IDefinitionDataEntity> qry = _ctx.CreateQuery<IDefinitionDataEntity>(String.Format("{0}", _currentSetName));

        // Only active records, sorted by name
        var z = (from x in qry
                 where x.Active
                 orderby x.Name
                 select x);
        _currentSet = z;
        bs.DataSource = _currentSet;
    }

NOTES:
- _currentSet is of type object and is used to hold the data returned by our object query which in turn we use as the DataSource of our bs (BindingSource)
- We only load Active entities(records)
- entityType parameter comes from the user selecting the definition data through a lookup control. We identified the real types of the definition entities in Step-3 and filled the lookup control with them.
Here is what the final editor looks like
This is a Turkish application, so Tanımlar is the lookup where the user selects the definition entity. The description displayed in the lookup comes directly from the Description attribute we talked about in Step-2.
Handling Concurrency Exceptions
EF4 supports optimistic concurrency through the Concurrency Mode property of entity properties. You simply set Concurrency Mode = Fixed on every property you want included in the concurrency check and EF handles the rest. Behind the scenes EF adds the original values of those properties (that is, the values read when the entity was first materialized in the ObjectContext) to the WHERE clause of the UPDATE and DELETE statements. When a statement is executed against the database and the affected row count is zero, EF infers that the record was modified by another user or process and throws a concurrency exception. This is simple and powerful for most scenarios. But I should warn you about using timestamp columns here. Typically a SQL Server timestamp column is used to detect whether a record has been modified since you last fetched it. In parent/child constructs, however, when a child entity is updated EF also issues a fake update on the parent entity, which changes the parent's timestamp value and in turn causes concurrency exceptions for the parent even though it was not explicitly modified by another user or process. This behaviour stems from the assumption that modifying a child entity may also conceptually modify its parent. That assumption does not hold all, or maybe even most, of the time. In this case you have to implement your own strategy for marking modifications. A simple solution can be found here. We used the sample there as a starting point to implement our own.
Too much intro to optimistic concurrency, anyway. When you get a concurrency exception you have two choices
1) Inform user about the problem and reload the data automatically
2) Inform user about the problem and give user the option to decide what to do; reload data (StoreWins) or overwrite the data on the database (ClientWins)
We went with option 2 and present a dialog where our users can select what to do next. This kind of implementation is not very straightforward, especially if you are dealing with a complex user interface presenting lots of related information for a single entity (record). We faced some problems with the Refresh method of the ObjectContext. You need to know exactly which entity or entity set to refresh, which is sometimes not possible because you simply bound your user interface to the properties of an entity.
Here is a sample SaveChanges implementation where we try to handle concurrency exceptions
There is a lot of code in this method but I just want you to concentrate on these lines
- Line 56: We call SaveChanges to persist changes to the database
- Line 68: We catch the optimistic concurrency exception
- Line 74: We present option dialog where our users can choose what to do next
- Line 78 and Line 79: We call Refresh with user selected mode for our Project entity and ComissionMembers EntitySet (Luckily we know what to refresh :)
- Line 100-118: If user selected ClientWins mode we re-issue SaveChanges so that user changes will be persisted to the database and overwrite the values in the database
Here is our dialog where we ask our users what to do next in case of a concurrency exception. Again the dialog is in Turkish. For clarification, the first option is StoreWins, meaning the user's changes will be discarded; the second option is ClientWins, meaning the changes made by the user will overwrite the database.
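The recovery flow described above can be condensed into a small helper. This is a simplified sketch rather than the SaveChanges method used in the project: ConcurrencyHelper and the askUser callback (which stands in for the option dialog) are hypothetical, and only a single entity is refreshed.

```csharp
using System;
using System.Data;          // OptimisticConcurrencyException
using System.Data.Objects;  // ObjectContext, RefreshMode

// Hypothetical helper, not the method from the original application.
public static class ConcurrencyHelper
{
    // Returns true when the user's changes ended up in the database.
    public static bool SaveWithRecovery(ObjectContext context, object entity, Func<RefreshMode> askUser)
    {
        try
        {
            context.SaveChanges();
            return true;
        }
        catch (OptimisticConcurrencyException)
        {
            // Let the user choose between StoreWins and ClientWins.
            RefreshMode mode = askUser();

            // StoreWins discards local changes; ClientWins keeps them and
            // refreshes the original values so the next save can succeed.
            context.Refresh(mode, entity);

            if (mode == RefreshMode.ClientWins)
            {
                // May throw again if the record changes once more in the meantime.
                context.SaveChanges();
                return true;
            }
            return false;
        }
    }
}
```

In a real form the entity argument would be replaced by every entity and collection the screen is bound to, which is exactly the part that is hard to know in advance.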
UnitOfWork (ObjectContext) Decisions on WinForms
UnitOfWork is a core concept introduced in all ORM frameworks, implicitly or explicitly, and each ORM wraps or implements this concept through its own constructs. EF4 has the ObjectContext class, and EF generates a named class inherited from ObjectContext for your use. If you want EF to get data from or persist data to a database, your entry point is your named ObjectContext class. EF materializes your entities inside your ObjectContext instance, and once you detach your entities from your object context, that context can no longer persist them to the database.
Creating an instance of ObjectContext is not a performance-sensitive action, but when you begin populating the object context with entities you must be very careful. If you keep your object context instance in memory for a long time and perform too many data operations, your object graph may get too complicated and you will have thousands of objects in your object context, which in turn will cause major performance issues on the client side. So be careful when instantiating object contexts, plan how long they will be alive, and do not forget to dispose of them when you are finished.
In web application scenarios most ORM frameworks recommend that the boundaries of the UnitOfWork be defined by the request; that is, when a request arrives your unit of work starts, and when the request is completed the unit of work ends. This is a pretty effective strategy for the web, but in smart client applications we do not have requests. We have to plan carefully and maybe apply different strategies based on the application flow. Still, we have some candidates for WinForms
- ObjectContext per form
- ObjectContext per user control
- ObjectContext per use case
In our application we used all of the strategies specified above. We have forms with a single object context, user controls which manage their own object context, and user controls which just use the object context of the parent form. We also use a separate object context per use case in some wizard-style interaction scenarios. Overall, be careful while defining your unit of work, or else you will have memory issues and will have to handle detached objects manually.
Audit Logging with Object Context
Most applications provide some sort of audit trail at different levels of detail. Some applications just keep track of who created or modified the record, and when, as properties of the actual record; others keep more detailed information in separate databases or tables. In our application we used both approaches: for less critical data we just track who created or modified the record and when, and for sensitive and more complicated data we keep track of changes in separate tables.
With EF4 audit logging is fairly simple: you hook into the SavingChanges event of your ObjectContext class and get information about the changed entities from the ObjectStateManager of your object context instance. This is trivial; here is an example

    public class ContextInterceptor : IDisposable
    {
        private eXpressOtoEntities context;

        public ContextInterceptor(eXpressOtoEntities context)
        {
            this.context = context;
            this.context.SavingChanges += new EventHandler(WhenSavingChanges);
        }

        public void Dispose()
        {
            if (this.context != null)
            {
                this.context.SavingChanges -= new EventHandler(WhenSavingChanges);
                this.context = null;
            }
        }

        void WhenSavingChanges(object sender, EventArgs e)
        {

ContextInterceptor class is just a utility class, you could write the code in your ObjectContext class as well.

    partial class eXpressOtoEntities
    {

        partial void OnContextCreated()
        {

We simply create ContextInterceptor whenever a new ObjectContext instance is created.
So far so good: hooking is trivial and inserting log records is simple too; you create audit log entities, attach them to the context and save as usual. But there is an important point you should be aware of, which is optimistic concurrency. The concurrency exception is thrown during SaveChanges, after your WhenSavingChanges delegate has executed, so by then your log entities are already attached to the context. If you provide a recovery method to your users, as I explained in the "Handling Concurrency Exceptions" section, they will be able to issue another SaveChanges on the same context, which creates another set of audit log entities, and you will persist duplicate log entries to the database. To avoid this problem you should identify and detach the previously added log entities at the beginning of your WhenSavingChanges delegate. Here is the modified version of the WhenSavingChanges method of the ContextInterceptor class

    void WhenSavingChanges(object sender, EventArgs e)
    {
        context.DetachAdded<VehicleModelZeroPriceLog>();
        context.DetachAdded<JobOrderStateLog>();

DetachAdded<T> method simply detaches newly added entities of a specific type, in our case we use this method to detach any newly created entities of type VehicleModelZeroPriceLog and JobOrderStateLog. So with this little utility we will not get duplicate log entries in case of a recovery scenario from concurrency exceptions
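The implementation of DetachAdded<T> is not shown in this post. One plausible way to write it, offered here as an assumption rather than as the original code, is an extension method that walks the Added entries of the ObjectStateManager; the ObjectContextDetachExtensions class name is hypothetical.

```csharp
using System.Data;          // EntityState
using System.Data.Objects;  // ObjectContext, ObjectStateEntry
using System.Linq;

// Hypothetical helper; the DetachAdded<T> used in the post may differ.
public static class ObjectContextDetachExtensions
{
    // Detaches every entity of type T that is still in the Added state.
    public static void DetachAdded<T>(this ObjectContext context) where T : class
    {
        // Materialize the list first, because detaching changes the state manager.
        var pending = context.ObjectStateManager
            .GetObjectStateEntries(EntityState.Added)
            .Where(entry => entry.Entity is T)   // relationship entries have a null Entity
            .Select(entry => entry.Entity)
            .ToList();

        foreach (object entity in pending)
        {
            context.Detach(entity);
        }
    }
}
```

Because only entities in the Added state are touched, log rows that were already saved successfully stay tracked by the context.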
Conclusion
Working with EF4 was a smooth experience, and with an ORM background it was pretty easy to discover the shortcomings described in this post. I even had fun working with EF4. I also reported some bugs and suggested improvements for Visual Studio 2010 and the Model Designer to Microsoft.
I'm also looking forward to testing how EF4 fits into a more layered architecture and whether it is possible to implement the Repository Pattern. I will try to post as I experience different aspects of EF4.
Stay tuned...