For better or worse, Microsoft’s Entity Framework (EF) has become one of the most widely used ORM tools out there. While some may state that it's not among the better ones (or that it's not even a real ORM), it’s definitely the most convenient one: its integration with Visual Studio, and consequently with the MS SQL Server database, is unparalleled. This is especially relevant to people who haven’t used object-relational mapping before - there's almost no initial learning curve associated with the framework. While it is theoretically possible to use EF with a DBMS other than MS SQL Server, in practice it’s almost exclusively used in combination with Microsoft's database engine.
Lately, I wondered how you would ‘mock’ EF’s data model - together with an underlying MSSQL database - in such a scenario. The question arose in the broader context of writing integration tests (automated UI tests, to be exact) for a legacy application which uses the described EF/MSSQL combo as its data store mechanism. It turned out that the possible strategies are not very obvious and require some additional tools and effort...
This is part 1 of a three-part series that examines some ways to cope with the scenario described above. While the next two parts are more concerned with concrete implementation details, this part discusses some more general considerations and introduces the sample application that serves as a 'target' for the tests presented in the later parts.
Handling the Data Layer in Tests
In general, things tend to become a bit tricky when it comes to testing database-related parts of an application. 'But how would you go about database testing?' is an often-heard question from people who are new to unit testing, and for good reason: database-related testing is one of the more demanding areas in the testing field, mainly because of two aspects:
- There’s not a single, easy-to-use framework to test the database in complete isolation – no matter what DBMS you use.
- More often than not, the business logic of the application is partly implemented by the application's data access code, and partly by the database itself (e.g. through stored procs, triggers, constraints, etc.).
Moreover, if we take a closer look at the question ('How would you go about database testing?'), it becomes clear that we're effectively talking about two different scenarios here:
- Actually testing the data layer of a project.
- Stubbing out the data layer to enable other (integration-style) tests.
Testing the Data Layer
The best way to test an application’s data store mechanism is to test it together with the data access code that sits on top of it. In such a case, your test code will usually access the production code that is closest to the database. Roy Osherove, in his book The Art Of Unit Testing, has the following to say about it:
I usually write integration-style tests for the data layer (the part of the app structure that talks directly to the database) in my applications because data logic is almost always divided between the application logic and the database itself (triggers, security rules, referential integrity, and so on). […], the only way to make sure it works in tests is to couple testing the data-layer logic to the real database. (p. 277)
For a project with EF, you would write tests against the EF data model, thus acting as a data model client. In such a scenario, the underlying database could be kept in a known state, for example, by wrapping each test in a database transaction, or by resetting the entire database at the required points – either by running some scripts or by using the DB server's backup/restore mechanism.
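The transaction-per-test idea can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not code taken from the sample solution: the fixture name is hypothetical, SchoolEntities is the EF context class that is introduced later in this article, and the rollback relies on the fact that a System.Transactions.TransactionScope is rolled back on Dispose() unless Complete() was called.

```csharp
using System.Transactions;
using MbUnit.Framework;

[TestFixture]
public class TransactionalDataModelFixture // hypothetical fixture name
{
    private TransactionScope _transactionScope;
    private SchoolEntities _schoolContext; // EF context introduced later in this article

    [SetUp]
    public void SetUp()
    {
        // Start an ambient transaction before the context touches the database.
        _transactionScope = new TransactionScope();
        _schoolContext = new SchoolEntities();

        // Open the connection once, inside the scope, so that all commands
        // of this test run on the same, enlisted connection.
        _schoolContext.Connection.Open();
    }

    [TearDown]
    public void TearDown()
    {
        _schoolContext.Dispose();

        // Complete() is never called, so disposing the scope rolls back
        // everything the test has written.
        _transactionScope.Dispose();
    }
}
```

Opening the connection explicitly is a design choice of this sketch: it is meant to keep the whole test on a single connection, which reduces the chance of the transaction being promoted to a distributed one.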
I won't go into more details here because, as described above, it's the second scenario which served as a starting point for this article series:
Stubbing Out the Data Layer
Besides the necessity to test an application's data layer directly, we also need a way to simulate the entire data access functionality of an application in order to run other kinds of (integration-style) tests that somehow rely on the data access part, without actually testing it.
If your application architecture is made up along the lines of Domain-driven design, the general layer architecture might look roughly like this:
Note that this is only one out of many possible architectures, and it's in no way mandatory for using the tools and techniques which are presented here.
The important point is that you have some client code that you want to write tests against, and that there is an EF data model somewhere down the chain.
(Note: The 'client' here actually is the Repository, and Client code could be anything that calls it. The diagram admittedly is somewhat misleading in that respect...)
Such tests could be, for example, integration tests against an application's Business Logic layer, or User Acceptance Tests on the UI level that involve a UI automation framework such as WatiN, Selenium, or White.
Also, you could use any DBMS that works with EF (here's a list). The basic story remains the same...
Special Issues with MS Entity Framework
Writing tests against code that invokes EF is not easy – evidently, testability wasn't a concern at all when the framework was developed. The two biggest stumbling blocks on the way to writing tests are:
- EF stores the type of the data provider in its mapping file (*.edmx), which in turn is compiled into the respective assembly as an embedded resource. As a consequence, the specification of the DBMS technology is part of the application code and cannot be changed later. So there's no way to replace the underlying database with an embedded one like SQLite or SQL Server CE. You can only alter the location of the database, but you have to stick to the database type that was chosen once.
- EF does not adhere to the principles of interface-based programming. There are abstract classes, sealed classes, and classes with non-public constructors everywhere – but not a single interface. This means that using one of the free mocking frameworks such as Moq, FakeItEasy, or NMock is also not an option.
The Different Approaches
The restrictions outlined above leave us with these options (at least these are the ones that came to my mind):
- We have a 'static' database in place and wrap each test into a database transaction, or
- we mock (or fake) the EF code directly, using the (commercial) Typemock Isolator tool, or
- we set up a dedicated instance of the SQL Server database and feed it with the required test data via the NDbUnit library.
Regarding the first strategy, I won't discuss it here in depth (note however that it is included with the sample solution – see the CourseManager.Test.MsSqlServer project), because this article series is more about mocking test data, whereas this approach uses a predefined set of data. Of course, you could extend it in various ways to cope with specific needs, but this would be a different story. I'd only like to mention here that this approach could bring its own set of problems:
- You're writing the tests against a fixed set of data, which makes this approach somewhat inflexible and sometimes may even prevent you from covering a specific scenario (of course, there are various ways to cope with such a situation, but they are not easily implemented and require quite some additional effort).
- Because you're heavily using transactions, it's easy to run into specific issues from this side (especially if your test wants to use a transaction of its own, when the entire test method is already wrapped in a transaction).
The other two approaches (Typemock and NDbUnit) are fairly different in how they do things: While the Typemock approach directly intercepts the calls to the EF classes, NDbUnit manipulates the underlying database. In other words, the two approaches act upon different levels of an application's data access system, as shown below:
Also, the two methodologies are not equivalent in terms of what is best in which scenario. We’ll see that in more detail in the next parts, but the bottom line is this:
- Typemock is the preferable tool when developing an application or feature, because it's easy to alter a return value, you usually get an immediate compile error if your test code is not in sync with the database schema, and it's lightning fast, because there is no disk I/O whatsoever involved. The biggest drawback of Typemock is the fact that things easily become very expensive and hard to maintain when you have to fake things like referential integrity constraints and/or a massive amount of data.
- NDbUnit, on the other hand, shines when you write tests against a relatively stable database schema, when you have a huge amount of data, or when you are heavily concerned with referential integrity constraints and their interrelations. It lets you declare all of your test data in external XML files, which can then be loaded on a per-test basis. The most prominent downside here is the framework's lack of robustness against schema changes: errors only show up at runtime, and chances are good that the error messages will not point you directly to the underlying problem.
The Sample Target Application: Course Manager
We start with the Course Manager sample application as taken from the MSDN Quickstart. It’s a very simple EF app with only the bare minimum:
- a database (‘School’)
- an EF data model on top of the School database
- a form to display/edit these data
Here's a screenshot:
In this very simple app, the form then is the client of the data model.
The Data Connection
The connection string for the application's database is defined in the App.config. The EF connection string embeds the complete, normal MS SQL Server connection string, so you can easily modify the name and location of the underlying database (but – as pointed out above – you cannot change the type of the provider). The App.config looks like this:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <connectionStrings>
    <add name="SchoolEntities"
         connectionString="metadata=res://*/School.csdl|res://*/School.ssdl|
         res://*/School.msl;provider=System.Data.SqlClient;provider connection string=&quot;
         Data Source=<server instance name>;Initial Catalog=School;Integrated Security=True;
         Pooling=False;MultipleActiveResultSets=True&quot;" providerName="System.Data.EntityClient" />
  </connectionStrings>
</configuration>
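To illustrate which parts of this connection string can vary and which are fixed, the same string could also be assembled in code. The following is a minimal sketch, not part of the sample solution: the helper class SchoolConnectionStrings and its Build method are hypothetical, while SqlConnectionStringBuilder and EntityConnectionStringBuilder are the standard .NET Framework builder classes.

```csharp
using System;
using System.Data.EntityClient;
using System.Data.SqlClient;

public static class SchoolConnectionStrings // hypothetical helper, for illustration only
{
    public static string Build(string serverInstance, string databaseName)
    {
        // The plain SQL Server part: name and location of the database may vary.
        var sqlBuilder = new SqlConnectionStringBuilder
        {
            DataSource = serverInstance,
            InitialCatalog = databaseName,
            IntegratedSecurity = true,
            Pooling = false,
            MultipleActiveResultSets = true
        };

        // The EF wrapper: metadata resources and provider type stay fixed.
        var entityBuilder = new EntityConnectionStringBuilder
        {
            Metadata = "res://*/School.csdl|res://*/School.ssdl|res://*/School.msl",
            Provider = "System.Data.SqlClient",
            ProviderConnectionString = sqlBuilder.ToString()
        };

        return entityBuilder.ToString();
    }
}
```

Assuming the generated context class offers the usual constructor overload that takes a connection string, a test could then point it at another database instance, for example with new SchoolEntities(SchoolConnectionStrings.Build(@".\SQLEXPRESS", "School")), where the server instance name is just a placeholder.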
Adding Some ‘meat’ to the Target
In theory, this would be enough to demonstrate the approach: We could write UI tests against the form while stubbing out the database – in fact, this would be very close to what might happen in real development practice. But because UI tests are quite a demanding topic in themselves and require at least some additional testing framework (and thus would introduce further complexity), I modified/extended the original solution to have a separate data access layer with a simple repository. This resulted in the assembly CourseManager.Data, which looks like this:
As you can see, I added a PersonRepository class, which acts as our sample client for the EF data model. This will be the class that we’ll write our tests against (again: it could also be any other code that sits on top of an EF data model).
General Code and Fixture Layout
Entity Framework encapsulates all of its database functionality, as well as all data entities from the conceptual model, in one single class that derives from the ObjectContext base class. This is a horribly complex and far too long beast, and it does far too many different things (in other words: there's absolutely no separation of concerns). The EF designer combines this base class with the conceptual model from our School database and generates a SchoolEntities class for us, which serves as our single access point to the database. Consequently, our sample repository class (PersonRepository), which is there to encapsulate all the concrete data access code from the rest of the application, gets an instance of this SchoolEntities class via its constructor. (This is an example of Dependency Injection, a coding technique that is highly recommended for writing clean, maintainable code, and is invaluable if you want to design your code for testability.)
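The constructor of the repository is not shown in this article. A minimal sketch of how the injection might look is given below; it is inferred only from the _entities field and the new PersonRepository(...) call that appear in the later listings, and the null check is an assumption of this sketch.

```csharp
using System;

public class PersonRepository
{
    // The injected EF context; all data access of the repository goes through it.
    private readonly SchoolEntities _entities;

    public PersonRepository(SchoolEntities entities)
    {
        // Assumed guard clause: fail early if no context was provided.
        if (entities == null)
            throw new ArgumentNullException("entities");

        _entities = entities;
    }
}
```

Because the context comes in from the outside, a test can decide which SchoolEntities instance (and thus which database, or which fake) the repository works with.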
So the first part of our test fixture for the PersonRepository class will be this (using Gallio):
[TestFixture, TestsOn(typeof(PersonRepository))]
[Metadata("MSDN Entity Framework Quickstart URL",
    "http://msdn.microsoft.com/en-us/library/bb399182.aspx")]
[Description("Tests against the 'School' database as it is provided by MS. " +
    "This expects the 'School' database from the quickstart sample.")]
public class PersonRepositoryFixture
{
    #region Fields

    private SchoolEntities _schoolContext;
    private PersonRepository _personRepository;

    #endregion // Fields

    #region Setup/TearDown

    [SetUp]
    public void SetUp()
    {
        _schoolContext = new SchoolEntities();
        _personRepository = new PersonRepository(_schoolContext);
    }
    ...
Here, we declare an instance of the PersonRepository class, and we initialize it with a SchoolEntities object. By using a SetUp method, we repeat this initialization for every single test, because the ObjectContext is a highly stateful class (for example, it holds a cache with all the already loaded entities), and we want to make sure that our test outcomes are not influenced by previous operations.
And Finally: The First Test
To make things a little bit more interesting, I also extended the Person entity to have a GetFullName() method which returns the full name of the Person according to the provided nameOrdering argument:
public partial class Person
{
    // Note: InvalidEnumArgumentException lives in System.ComponentModel.
    public string GetFullName(NameOrdering nameOrdering)
    {
        switch (nameOrdering)
        {
            case NameOrdering.Normal:
                // "FirstName LastName"
                return String.Format("{0} {1}", _FirstName, _LastName);
            case NameOrdering.List:
                // "LastName, FirstName"
                return String.Format("{0}, {1}", _LastName, _FirstName);
            default:
                throw new InvalidEnumArgumentException
                    ("nameOrdering", (int)nameOrdering, typeof(NameOrdering));
        }
    }
}
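The NameOrdering enumeration itself is not shown here. The following self-contained sketch assumes that it consists of just the two members used above, and it uses a plain stand-in class (the hypothetical PersonStandIn) instead of the EF-generated Person entity, so that the two formats can be tried out without a database:

```csharp
using System;

// Assumed definition: the article only shows that the members Normal and List exist.
public enum NameOrdering
{
    Normal, // "FirstName LastName"
    List    // "LastName, FirstName"
}

// Plain stand-in for the EF-generated Person entity, for illustration only.
public class PersonStandIn
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public string GetFullName(NameOrdering nameOrdering)
    {
        switch (nameOrdering)
        {
            case NameOrdering.Normal:
                return String.Format("{0} {1}", FirstName, LastName);
            case NameOrdering.List:
                return String.Format("{0}, {1}", LastName, FirstName);
            default:
                throw new ArgumentOutOfRangeException("nameOrdering");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var person = new PersonStandIn { FirstName = "Kim", LastName = "Abercrombie" };

        Console.WriteLine(person.GetFullName(NameOrdering.Normal)); // Kim Abercrombie
        Console.WriteLine(person.GetFullName(NameOrdering.List));   // Abercrombie, Kim
    }
}
```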
Consequently, the PersonRepository class has a corresponding method GetNameList(), which returns an alphabetically sorted list of these Person names:
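public List<string> GetNameList(NameOrdering nameOrdering)
{
    try
    {
        // ToList() loads the entities first; GetFullName() is then
        // evaluated in memory, not translated to SQL.
        var fullNames = new List<string>(
            _entities.People
                .ToList()
                .Select(p => p.GetFullName(nameOrdering)));
        fullNames.Sort();
        return fullNames;
    }
    catch (Exception exception)
    {
        // Wrap any data access error in a repository-specific exception.
        throw new RepositoryException("Error getting person names.", exception);
    }
}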
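The RepositoryException class is not shown in this article either. A minimal sketch of what it might look like is given below; its shape is an assumption based solely on the (message, inner exception) constructor call above, and the small demo class is hypothetical.

```csharp
using System;

// Assumed shape: only the (message, innerException) constructor is known from the article.
public class RepositoryException : Exception
{
    public RepositoryException(string message, Exception innerException)
        : base(message, innerException)
    {
    }
}

public static class RepositoryExceptionDemo // hypothetical demo, for illustration only
{
    public static void Main()
    {
        try
        {
            try
            {
                throw new InvalidOperationException("Simulated data access failure.");
            }
            catch (Exception exception)
            {
                // Same wrapping pattern as in GetNameList().
                throw new RepositoryException("Error getting person names.", exception);
            }
        }
        catch (RepositoryException wrapped)
        {
            Console.WriteLine(wrapped.Message);                       // Error getting person names.
            Console.WriteLine(wrapped.InnerException.GetType().Name); // InvalidOperationException
        }
    }
}
```

Keeping the original exception as the inner exception means that callers only need to handle one repository-specific type, while the underlying cause is still available for diagnostics.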
And finally, the tests for GetNameList() look like this:
[Test, MultipleAsserts, TestsOn("PersonRepository.GetNameList")]
public void GetNameList_ListOrdering_ReturnsTheExpectedFullNames()
{
    List<string> names =
        _personRepository.GetNameList(NameOrdering.List);

    Assert.Count(34, names);
    Assert.AreEqual("Abercrombie, Kim", names.First());
    Assert.AreEqual("Zheng, Roger", names.Last());
}

[Test, MultipleAsserts, TestsOn("PersonRepository.GetNameList")]
public void GetNameList_NormalOrdering_ReturnsTheExpectedFullNames()
{
    List<string> names =
        _personRepository.GetNameList(NameOrdering.Normal);

    Assert.Count(34, names);
    Assert.AreEqual("Alexandra Walker", names.First());
    Assert.AreEqual("Yan Li", names.Last());
}
Nothing complicated or surprising here (in general, tests should always be that simple and 'trivial'): We simply know what's in the database, and we check that the list returned by the repository correctly mirrors it.
Conclusion
Ok, that's it for now. We've covered a lot of ground to set the stage for the forthcoming parts - maybe too much, but I didn't want to close the first part without presenting a single test.
This first part of the article series introduced the sample application that will be used as a target for our tests, discussed some general aspects of database-related testing, and also presented some of the more specific issues that you will have to handle when dealing with MS Entity Framework. The next two parts will then discuss the approaches outlined here, using the Typemock and NDbUnit tools, in more detail, building on top of the CourseManager sample application described here.
The Sample Solution
A sample solution (VS 2010) with the code from this article series is available via my Bitbucket account from here (Bitbucket is a hosting site for Mercurial repositories. The repositories may also be accessed with the Git and Subversion SCMs - consult the documentation for details. In addition, it is possible to download the solution simply as a zipped archive – via the 'get source' button on the very right.). The solution contains some more tests against the PersonRepository class, which are not shown here. Also, it contains database scripts to create and fill the School sample database.
To compile and run, the solution expects the following to be installed: the Gallio/MbUnit framework (which is free and can be downloaded from here), the NDbUnit framework (which is also free and can be downloaded from here), and the Typemock Isolator tool. Moreover, you will need an instance of the Microsoft SQL Server DBMS, and you will have to adapt the connection strings in the test projects' App.config files accordingly.