Introduction
Have you ever tried to use the DbContext, the Entity Framework equivalent of ADO.NET's DbConnection, from different threads? If you have, you know the alien and cryptic error messages it throws at you. About all you can gather from those messages is that you cannot easily use Entity Framework from multiple threads unless you take special care.
In this article I introduce a thread-safe data manager that you can not only pass around between your threads, but that also frees you from trivial tasks like creating repositories and IUnitOfWork implementations.
Background
Entity Framework is Microsoft's answer to NHibernate, Telerik's DataAccess and various other object-relational mapping (ORM) frameworks. Internally it uses the same IDbConnection, IDbCommand and related interfaces, but it gives you an easy way to specify queries in C# code rather than SQL. You can use Entity Framework (EF) directly by simply instantiating the DbContext, but it is recommended that you use the Repository pattern together with IUnitOfWork.
In this pattern, you implement basic functions like GetAll(), Add(), Update() and Delete() for each of your entities, so that you can build complex functionality on top of them while keeping all the database logic encapsulated inside the repositories. It is an interesting idea, but what do you do if you have 30+ entities? The usual practice is to create a separate repository for each of them, writing the same code again and again. A clever person might use generics to reduce that work, but the idea stays the same: a separate repository for each of your entities.
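The generics idea mentioned above can be sketched as follows. This is a minimal illustration and not the Generic Data Manager API: the IRepository<T> interface, the InMemoryRepository<T> class and the two entity classes are hypothetical names, and a List<T> stands in for the database so the sketch runs without Entity Framework. Update() is left out because in-memory objects are modified in place.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface for illustration only; not part of Generic Data Manager.
public interface IRepository<T> where T : class
{
    IEnumerable<T> GetAll(Func<T, bool> filter = null);
    void Add(T entity);
    bool Delete(T entity);
}

// A List<T> stands in for a database table so the sketch runs without EF.
public class InMemoryRepository<T> : IRepository<T> where T : class
{
    private readonly List<T> _items = new List<T>();

    // Returns a snapshot of all items, or only those matching the filter.
    public IEnumerable<T> GetAll(Func<T, bool> filter = null)
    {
        return filter == null ? _items.ToList() : _items.Where(filter).ToList();
    }

    public void Add(T entity) { _items.Add(entity); }

    public bool Delete(T entity) { return _items.Remove(entity); }
}

// Hypothetical entities, loosely modelled on the sample tables used later.
public class Employee { public string FullName { get; set; } }
public class Department { public string Name { get; set; } }

public static class Program
{
    public static void Main()
    {
        // One generic class serves both entities; no per-entity repository code.
        var employees = new InMemoryRepository<Employee>();
        var departments = new InMemoryRepository<Department>();

        employees.Add(new Employee { FullName = "Eve Adams" });
        employees.Add(new Employee { FullName = "Bob Stone" });
        departments.Add(new Department { Name = "Finance" });

        foreach (var e in employees.GetAll(x => x.FullName.StartsWith("E")))
            Console.WriteLine(e.FullName);
    }
}
```

Even with generics you still instantiate one repository per entity type, which is the part the data manager described below takes over.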
And what about multithreading? With every version increment of the framework, multithreading has become easier to understand and use, so why not use it to get the work done faster? Consider applications that are multithreaded and also rely on Entity Framework to store information in and retrieve it from the database. Suppose you have an application that has to process an electronically submitted form, extract information, save it to the database and read the next form. At the same time, it needs to scan the newly added records, extract some information and notify some subscribers. Can you use repositories from these threads? Yes and no.
Yes, if you create the repository inside the thread, use it and dispose of it right there. No, you cannot use the repository of one thread in another thread. DbContext instances are not thread safe. If you try, you will get all sorts of exceptions that will keep you appreciating the knowledge of Dr. Google the whole day.
What is Generic Data Manager?
Generic Data Manager, an open source library, tries to provide a solution to these problems. The idea is "concentrate only on what to do, and leave the how to the library". The library not only provides the repository pattern implementation for all your entities but also provides thread safety in its own way. Disposal and other housekeeping are taken care of by the manager.
The manager implements the IDataRepositoryProvider interface, which gives you repositories for your entities. The manager is thread safe, but the repositories it provides are not. You can toss the manager around among your threads, and each thread requests a repository from it. The manager creates a DbContext specific to that thread, creates a repository that uses that DbContext and returns it. The manager keeps a list of DbContexts, their associated repositories and the threads they belong to. If another request for a repository comes from a thread for which a DbContext already exists, the manager simply creates a repository, binds it to the existing context and returns it. You use the repository and, once you are done, call Dispose on it. Even if you forget to call Dispose, the manager will take care of it, provided that you are using an execution policy that covers that situation.
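The one-context-per-thread idea can be sketched roughly as below. This is not the library's actual implementation: PerThreadContextManager and FakeContext are hypothetical names, FakeContext stands in for DbContext so the sketch runs without Entity Framework, and repositories are left out to keep it short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Stand-in for DbContext so the sketch runs without Entity Framework.
public sealed class FakeContext : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() { IsDisposed = true; }
}

// Hypothetical manager: safe to share between threads, while every thread
// gets a context of its own that no other thread ever sees.
public sealed class PerThreadContextManager
{
    // Keyed by the Thread object itself, so each thread owns exactly one entry.
    private readonly ConcurrentDictionary<Thread, FakeContext> _contextMap =
        new ConcurrentDictionary<Thread, FakeContext>();

    public int ContextCount { get { return _contextMap.Count; } }

    public FakeContext GetContextForCurrentThread()
    {
        // Creates a context on first use by this thread and reuses it afterwards.
        return _contextMap.GetOrAdd(Thread.CurrentThread, _ => new FakeContext());
    }
}

public static class Program
{
    public static void Main()
    {
        var manager = new PerThreadContextManager();
        var threads = new Thread[3];

        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                var first = manager.GetContextForCurrentThread();
                var second = manager.GetContextForCurrentThread();
                Console.WriteLine("Thread {0}: same context reused = {1}",
                    Environment.CurrentManagedThreadId,
                    ReferenceEquals(first, second));
            });
            threads[i].Start();
        }

        foreach (var t in threads) t.Join();
        Console.WriteLine("Contexts created: {0}", manager.ContextCount);
    }
}
```

Because a given key is only ever requested by the thread it represents, two threads never compete for the same entry, and the shared map is the only structure that needs to be thread safe.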
Connection Parameters
This is a structure that the data manager requires to connect to the database, build DbContexts and bind the repositories for your entities. It has four members:
- Connection string: a plain connection string with no entity model information, so that it is easier to provide.
- Model name: the name you gave your model. Do not confuse it with the model file name.
- Delay: a time span used for various internal wait and sleep functions. If you specify none, the default is 30 seconds.
- Provider name: the provider name for the database you are using. The default is "System.Data.SqlClient", i.e. SQL Server.
This structure resides in the GenericDataManager.Common namespace. It is one of the two mandatory parameters that the data manager requires. Think of it as your model and database information.
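To see why a plain connection string is easier to provide, it helps to compare it with the string a database-first Entity Framework model normally needs. The sketch below only builds the two strings; the helper class and method names are hypothetical, and "localhost" and "SampleDb" are placeholder values rather than anything the library requires.

```csharp
using System;

// Hypothetical helper for illustration; not part of Generic Data Manager.
public static class ConnectionStringComparison
{
    // Plain ADO.NET-style connection string, with no entity model information.
    public static string Plain(string server, string database)
    {
        return string.Format(
            "data source={0};initial catalog={1};integrated security=True;",
            server, database);
    }

    // Typical database-first EF connection string: the three model metadata
    // resources, the provider name, and the plain string wrapped in quotes.
    public static string EntityFramework(string modelName, string provider, string plain)
    {
        return string.Format(
            "metadata=res://*/{0}.csdl|res://*/{0}.ssdl|res://*/{0}.msl;" +
            "provider={1};provider connection string=\"{2}\"",
            modelName, provider, plain);
    }

    public static void Main()
    {
        string plain = Plain("localhost", "SampleDb");
        Console.WriteLine(plain);
        Console.WriteLine(EntityFramework("MyModel", "System.Data.SqlClient", plain));
    }
}
```

The second string carries the same three pieces of information as the connection parameters (connection string, model name and provider name), just packed into one value.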
What is an Execution Policy?
The library provides two data managers:
- Legacy Data Manager
- Data Manager with Policy (it is recommended that you use this manager for your programming needs)
The Legacy Data Manager has been there since the library's inception. When I first created the library, it was all the library contained, with little room for customization. It is still there for backward compatibility. Version 2 of the library is currently in beta but is pretty much stable. The new version adds the Data Manager with Policy, an implementation that offers a lot of customization, and that customization is done through the execution policy.
Most of the things that the execution policy controls affect the behaviour of the data manager when multiple threads are involved. These include, but are not limited to:
- How the DbContext is associated with its thread
- How long a repository should be considered in use
- When to close the DbContext and destroy its associated repositories
- How often the cleaner (a daemon thread; think of it as the garbage collector of the data manager) should be invoked
- How the cleaner should perform its task
For a single-threaded application, you do not need much customization of the execution policy. Just instantiate the default one and pass it in. It does not require any extra parameters or other information.
What is a Cleaner?
Internally the data manager keeps a thread-safe list that holds information about the threads, their DbContexts and the repositories associated with them. This is called the context map. The context map is the core of the data manager, whether legacy or policy based. The cleaner is a background thread whose job is to clean up the context map and remove the DbContexts and repositories that are no longer needed. To understand it fully you need to understand how the data manager works.
When you request a repository for a certain entity, the data manager checks the context map to see if the thread from which you are calling is already there. If the thread is there, it checks whether its DbContext is active and scans the list of repositories attached to it. If the repository for the entity you requested is there, it simply returns that repository. You use it and dispose of it.
If the thread from which you made the request is not there, the data manager creates a DbContext on that thread and records it in the context map along with the thread. Since the DbContext is newly created, the data manager creates the repository, associates it with the DbContext (which means there is a list of repositories linked to each DbContext) and returns it to you. You use it and either dispose of it or perhaps forget to. We will see later that this case is also handled easily.
When you create the data manager with policy, it also creates a background thread whose sole job is to scan the context map from time to time and get rid of unused DbContexts and repositories. At specified intervals, the cleaner scans the context map and marks the unused DbContexts. Once all are marked, they are removed from the context map. The data manager keeps on working without any disruption while the cleaner disposes of the connections one by one, separately.
The execution policy has various parameters and strategies that affect how the cleaner works. For instance, there is a strategy that says that once a thread is dead there is no point in keeping its DbContext alive, so once the thread finishes, the data manager closes its DbContext and disposes of all its repositories, even if you forgot to dispose of any of them. A word of caution, though: do not use this strategy to make up for bad programming habits.
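The dead-thread strategy can be sketched as a background thread that periodically sweeps a context map. Everything here is an illustrative assumption rather than the library's code: ContextMapWithCleaner, FakeContext and CleanOnce are hypothetical names, and the sweep interval is an arbitrary value chosen for the demo.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Stand-in for DbContext so the sketch runs without Entity Framework.
public sealed class FakeContext : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() { IsDisposed = true; }
}

// Hypothetical context map with a cleaner that drops entries of dead threads.
public sealed class ContextMapWithCleaner : IDisposable
{
    private readonly ConcurrentDictionary<Thread, FakeContext> _contextMap =
        new ConcurrentDictionary<Thread, FakeContext>();
    private readonly ManualResetEventSlim _stop = new ManualResetEventSlim(false);
    private readonly Thread _cleaner;

    public ContextMapWithCleaner(TimeSpan interval)
    {
        _cleaner = new Thread(() =>
        {
            // Wait returns false on timeout (time to sweep) and true once
            // _stop has been set (time to exit).
            while (!_stop.Wait(interval)) CleanOnce();
        });
        _cleaner.IsBackground = true; // must not keep the process alive
        _cleaner.Start();
    }

    public int Count { get { return _contextMap.Count; } }

    public FakeContext GetContextForCurrentThread()
    {
        return _contextMap.GetOrAdd(Thread.CurrentThread, _ => new FakeContext());
    }

    // One sweep: remove and dispose the contexts whose owning thread has ended.
    public int CleanOnce()
    {
        int removed = 0;
        foreach (var entry in _contextMap)
        {
            if (entry.Key.IsAlive) continue;
            FakeContext context;
            if (_contextMap.TryRemove(entry.Key, out context))
            {
                context.Dispose();
                removed++;
            }
        }
        return removed;
    }

    public void Dispose()
    {
        _stop.Set();     // ask the cleaner to exit
        _cleaner.Join(); // wait until it has
        _stop.Dispose();
    }
}

public static class Program
{
    public static void Main()
    {
        using (var map = new ContextMapWithCleaner(TimeSpan.FromMilliseconds(50)))
        {
            // The worker never disposes its context; the cleaner does it later.
            var worker = new Thread(() => map.GetContextForCurrentThread());
            worker.Start();
            worker.Join();

            map.GetContextForCurrentThread(); // the main thread's entry stays alive
            Thread.Sleep(300);                // give the cleaner time to sweep
            Console.WriteLine("Entries left in the map: {0}", map.Count);
        }
    }
}
```

Enumerating a ConcurrentDictionary while other threads add or remove entries is safe, which is why requests for contexts can continue while a sweep is in progress.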
We will discuss the execution policy and the cleaner in detail in the next article, where we make use of multithreading.
Using the Data Manager
Prerequisites
Using the code is extremely easy. I'm making a few assumptions here:
- You are using Visual Studio 2012 or later. The Community edition will do fine.
- You are using SQL Server 2005 or later. Again, the Express edition is just fine.
- You have basic knowledge of Entity Framework.
- You have basic knowledge of multithreading.
- You have basic knowledge of the Repository pattern and IUnitOfWork.
The Database along with Data
I will follow the database-first approach. All it means is that I create my database first and then create my model based on it. To keep this article short, please run SampleDatabaseScript.sql. It will create the necessary tables and put in some sample data too, so you have something to start with quickly. Provided that your database was empty and everything executed fine, it should now contain the tables created by the script.
Entity Framework Model
Now we will create a project that will contain our Entity Framework model based on the database we created earlier. Fire up Visual Studio, choose New Project and pick the Class Library type. You can choose any name you like; for the sake of this article, I will name mine "Common Model". You now have a skeleton project. Delete the Class1.cs file; we are not going to need it. At this stage your project is an empty class library.
We need to add a model to it that refers to the database we just created. Go to the Project menu and choose Add New Item... Add a new entity model with the following settings:
- Model name: MyModel
- EF Designer from Database
- Entity Framework 6.0
- Model Namespace: CommonModel
- Choose Tables: Department, Employee
Delete MyModel.Context.tt; we are not going to need it, because Generic Data Manager will create the context dynamically. Build the project now and make sure it compiles without errors.
Let's move on to the next step.
Simple Application
We will now create a simple application, single threaded of course, that uses the Generic Data Manager. There are a few necessary preliminary steps: creating an app, adding NuGet packages and adding a reference to our entity model.
- Right-click the solution, choose Add | New Project... and pick Console Application.
- To add Generic Data Manager, right-click your console application, choose Manage NuGet Packages..., go to the Browse tab and search for Generic Data Manager. It should be listed; choose it. (You do not need to install the Entity Framework package separately. It is installed automatically with Generic Data Manager.) You can also install it from the Package Manager Console with the command: Install-Package GenericDataManager -Pre
- Make your console application the startup project by right-clicking it and choosing "Set as Startup Project". You will also need to reference the project that contains the entity model from your console application.
- Build your solution to verify that everything works so far.
All those steps were preparation. Now that everything is set up, we are ready to use the Generic Data Manager and see what it holds for us.
Using the Generic Data Manager
Using the data manager is simple. First things first: add the necessary using directives.
using GenericDataManager;
using GenericDataManager.Common;
using GenericDataManager.Interfaces;
To use the data manager you need four things:
- simple connection string
- your model name
- provider name (Sql Server i.e. System.Data.SqlClient is the default)
- execution policy
If you have ever used a connection string for Entity Framework, you know they can be extremely complex. Not only do they provide the connection string and the provider, they also describe the entity model files. Generic Data Manager (GDM) frees you from all that. All it needs is a simple connection string, a provider name and the model name. That is what we are going to do: instantiate it and provide the necessary information. In Main(), add the following lines:
var connection = new ConnectionParameters(
"data source=<yourSource>;initial catalog=<your DB>;integrated security=True;",
"MyModel");
var policy = new ExecutionPolicy();
IDataRepositoryProvider manager = new GenericDataManager.DataManagerWithPolicy(connection, policy);
To keep things simple, we use the default execution policy. In Generic Data Manager, interfaces are everything. Here is something important: most of the time the interfaces are implemented explicitly, so unless you cast your object to the proper interface, you will not see the necessary functions. Now that you have your data manager, i.e. the IDataRepositoryProvider, ready, let's use it.
using (var repository = manager.Repository.Get<Common_Model.Employee>())
{
    foreach (var emp in repository.All(x => x.FullName.StartsWith("E")))
        Console.WriteLine("{0} {1}", emp.FullName, emp.Email);
}
We requested a repository for our Employee entity, which resides in the Common_Model namespace. The IDataRepositoryProvider has a Repository property through which you specify what kind of repository you want and for which entity. The repository gives you functions like:
- One: retrieves a single record, based on a lambda expression
- All: retrieves a list of records, based on a lambda expression, or all records if you specify none
- Count: counts the records; provide a lambda and it counts only the records that fit the criteria
- Exists: checks if there are any records that satisfy the condition
- Add: bulk and lambda operations are supported
- Update: bulk and lambda operations are supported
- Delete: bulk and lambda operations are supported
So you do not have to write much. As mentioned earlier: concentrate on what you want to do rather than how to do it. With these basic functions, you can build complex ones. In the example above we only retrieve employees whose names start with the letter E and display their names and email addresses.
In the next article, we will discuss how the data manager can be used for multithreading with a custom execution policy.
Source, Documentation and Nuget Package
Let me know your comments and views. Your comments will help me improve the library.