Introduction
In this short article I'd like to share my latest investigation on choosing an ORM to use in a .NET application with Oracle as the DBMS. To be more precise I considered only two of them: the Microsoft Entity Framework 5 and NHibernate 3.3 (from here and below, I will omit pointing versions). The article describes the exact issues and considerations I faced and I hope it will be useful for the audience.
Background
I’ve got a chance to start a new medium-sized .NET application in invoicing domain area. Oracle 11g has been chosen by our client as the DBMS. And the performance of the invoice processing is one of the key constraints.
Of course, ORM was one of the first decisions I had to perform. After a few weeks of trying both Entity Framework and NHibernate, I’ve collected quite an interesting list of issues described below. I’d like to point out that this experience has not touched naive things like basic mappings or class hierarchy strategies. Instead, I have focused on strictly practical issues I have faced with, and in my opinion worth sharing with other developers and architects. And I will be appreciate it if someone finds any mistakes or misunderstandings.
Model
From here and further through the article, I will use the following simplified model. Each issue will appeal to a certain point in it.
public enum CustomerType
{
Private = 0,
Corporate = 1
}
public abstract class Entity
{
public virtual Guid Id { get; set; }
}
public class Customer : Entity
{
public Customer()
{
CustomerComments = new Collection<CustomerComment>();
Invoices = new Collection<Invoice>();
}
public virtual string CustomerNumber { get; set; }
public virtual bool IsActive { get; set; }
public CustomerType CustomerType { get; set; }
public DateTime? StartDate { get; set; }
public virtual ICollection<CustomerComment> CustomerComments { get; set; }
public virtual ICollection<Invoice> Invoices { get; set; }
}
public class Invoice : Entity
{
public virtual Customer Customer { get; set; }
public virtual string Number { get; set; }
}
In terms of DDD there are two root aggregates: Customer
and Invoice
, and the Customer
aggregate contains CostomerComments
as a plain child collection.
Basically, it is discussable whether to include the collection of Invoices in the Customer
class from the DDD aggregate's perspective, but it is not a point right now and let it be as is.
Issue 1. Parent-Child relations during loading
There is a subtle difference in working with child collections loading in Entity Framework and NHibernate. Let’s assume we load a group of customer records (e.g., by some filter filled by an end-user) and a group of invoices (again, with some filter filled by an end-user, e.g., for last month). Both groups are loaded via separate calls. The key point is that we definitely know that loaded invoices are connected to loaded customers.
In terms of Entity Framework the calls would look like
List<Customer> customers = Context.Customers.Where(some condition);
Context.Invoices.Where(some condition).Load();
Where Context
is the same instance for both calls even though the calls might (and should) be in different repositories. In terms of NHibernate, the same calls would look like:
List<Customer> customers = Session.Query<Customer>().Where(some condition).ToList();
Session.Query<Customer>().Where(some condition).ToList();
And again, Session
instance is the same for both calls.
So, the difference is:
- Entity Framework will connect Customers with invoices in both ways: collection of invoices in Customer entity will be filled with appropriate invoices and Customer reference of each invoice will be filled with the appropriate Customer entity.
In other words, it means that
customer.Invoices.Count > 0
for a certain Customer which has invoices in
the second call. - NHibernate will connect Customers with invoices only in one way: collection of invoices in Customer entity will not be filled with
the appropriate invoices whereas the Customer reference of each invoice will be filled with the appropriate Customer entity.
In other words, it means that
customer.Invoices.Count == 0
for all the customers loaded unless we touch our collection somewhere in code that leads to loading it by proxy.
I should say that Lazy Load mode is ON for this example. Otherwise, NHibernate will load
all the invoices under all loaded Customers at the point of loading Customers.
The fundamental semantic idea why NHibernate does so is the idea of
aggregates. But here I point only the fact as is. In addition, one of the possible solutions for NHibernate if we need this working the same way as Entity Framework:
Issue 2. GUID primary key
Oracle does not have a dedicated GUID field type. It uses the RAW(16)
type instead. Unfortunately, in Code-First approach there is no way to tell Entity Framework what the exact database field type it has to use for the primary key property in mappings (e.g.,
Entity.Id
property in our model). The default database field that
is used by Entity Framework for GUID is RAW(2000)
(!). The only way to avoid this problem is to move to the Model-First approach wherein
the SSDL file is possible to put the needed database field type of RAW(16)
:
<Property Name="ID" Type="raw" Nullable="false" MaxLength="16" />
NHibernate works well with GUID property by default.
Issue 3. Boolean fields
One of the most surprising issues is that I have not found any way to map a Boolean property in Entity Framework (e.g., Customer.IsActive
property in our model). Oracle does not have a dedicated type to store Booleans (like SQL Server bit type) and usually NUMBER(1,0)
is used. But it
appears that it is not possible in any approach to map a Boolean.
NHibernate works well with Boolean property by default.
Update:
Thanks to answering on my article, I've re-checked my investigation and found that probably I had made a mistake and actually it is possible to map to boolean using config:
<oracle.dataaccess.client>
<settings>
<add name="bool" value="edmmapping number(1,0)" />
</settings>
</oracle.dataaccess.client>
Issue 4. Mapping enum properties
At first glance the mappings enum looks fine in both ORMs (e.g., Customer.CustomerType
property in our model). And at the beginning I did not find anything suspicious in it. But moving further and after analyzing queries produced by Entity Framework, I noticed a strange thing. If I use the enum field for filtering I get significant performance issues despite using
an index for the CustomerType
field:
List<Customer> customers = Context.Customers
.Where(c => c.CustomerType == CustomerType.Private)
.ToList();
By analyzing the SQL query produced by Entity Framework I found that it covers the enum field with
a CAST
expression like this:
... WHERE CAST(some_alias.CustomerType as NUMBER(38, 0)) = :plinq1
Moreover, in the case of nested queries (e.g.,
.Include()
was used in
the LINQ query), the SQL expression would be with nested
CAST
expressions:
... WHERE CAST(CAST(some_alias.CustomerType as NUMBER(38, 0))) = :plinq1
The problem here is that Oracle does
not use the index created for the field
CustomerType
if the field is covered with any function. In order to improve performance you should use
a function-based index instead of
CAST(CustomerType as NUMBER(38, 0))
. And what about nested queries? Nested queries produce nested
CAST
s:
CAST(...(CAST(CustomerType as NUMBER(38, 0)))...)
The additional indexes with nested
CAST
s is needed!
Unfortunately, I have not found a way neither to remove this CAST
nor to override
the SQL generation in Entity Framework to produce a SQL query without those CAST
s.
NHibernate works well with enum properties by default.
Issue 5. Null-object pattern on database level
Another subtle issue I have found due to performance improvement investigations is
NULL
columns in the database. It appears that Oracle works best when filtering, joining,
and ordering columns are NOT NULLABLE
. I will not go deeper in explanation why
this happens because it is not the point of this article. So it is crucial to avoid marking certain fields in the database as
NULLABLE
.
The idea is to apply the null-object pattern only on the database level by having dedicated values which
are interpreted as NULL
s in the Business Layer. And perform an appropriate conversion back and forward on the ORM level for still having nullable fields in
the model. There are two types of fields possible to be made nullable in model classes:
- Simple ones (e.g.,
Customer.StartDate
) - References (e.g.,
Invoice.Customer
)
So, my goal is to convert these to pre-defined values during loading from and saving to the database on the data access layer. E.g.,
a DateTime
value converts to a DateTime.MaxDate
value, and
a reference Invoice.Customer
converts to the special Customer which is selected to be the null-customer.
Unfortunately I have not found any way to do this with Entity Framework.
NHibernate has that possibility via implementation of the IUserType
interface and applying it with mapping. At first glance such
an implementation looks a bit messy, but only a couple of lines are needed to be changed in
the default implementation.
Issue 5. LINQ support
Entity Framework has the most powerful LINQ support. Although NHibernate has
the possibility to use LINQ via Session.Query<>()
, it is still not as rich as
what Entity Framework has. On the other hand, NHibernate has the powerful QueryOver()
mechanism which can solve any complicated issue in querying. And as soon as ORM and its queries are usually hidden beyond Repository and/or Query patterns, it is not a problem to use a mix of querying mechanisms if needed.
Issue 6. Extensibility and flexibility
In terms of extensibility and flexibility, NHibernate is still far more mature. It has a lot of points to extend/override/intercept its functionality. I was a bit surprised that Entity Framework has such a poor choice of points to extend/override/intercept its features.
Conclusion
Unfortunately, Entity Framework still does its best only with SQL Server. And if you have a project with Oracle DBMS underlying, do not hesitate: NHibernate is still the winner.