Introduction
There are a lot of articles and posts about: "How to increase performance of inserting new entities at case of Entity Framework?". Often, answers contain such recommendations as: to play with Configuration.AutoDetectChangesEnabled
, divide your data to portions and use for each of them individual context, disposing it, and a lot of other approaches. For example, in case of MS SQL, I use a very cool library BulkInsert
: https://efbulkinsert.codeplex.com/, but what we will do, if we use Oracle database? I will try to answer this question in this article.
What to Install
Fortunately, there is a project, forked from mentioned above BulkInsert
library: https://github.com/Thorium/EntityFramework.BulkInsert, with its help, we will solve our main problem - implementing BulkInsert
approach to increase insert performance in case of Oracle database.
First of all, you should download its source code (the most important for us is DevartProvider
class, located in Providers folder) and include the corresponding assembly in your project. Then, you need to load some external packages with the help of Nuget. They are:
I will explain for which reasons each package is needed:
EntityFramework.MappingAPI
for BulkInsert
inner stuff. Simple.Data.Oracle
and dependent Simple.Data
. packages contain Oracle provider which BulkInsert
assembly will use. Oracle.ManagedDataAccess
. packages necessity will be described lately. EntityFramework
is the core package, which is indeed always needed.
Solution
Contexts
Now, let's create our contexts:
internal class OracleContext : DbContext
{
public DbSet<Customer> Customers { get; set; }
public DbSet<Order> Orders { get; set; }
}
internal class OracleContextBulkInsert : OracleContext
{
public OracleContextInsert()
{
EntityFrameworkContextExtensions.RegisterDevartBulkInsertProvider(this);
}
}
But there is one restriction: OracleContextBulkInsert
approach is only applicable for performing BulkInsert
operation. If we will try to perform one of the common EF operations, for example, to get some data, an exception will be thrown. This is the point, where Oracle.ManagedDataAccess.
packages mentioned above come out to scene. Actually, we will use two contexts: OracleContext
and OracleContextBulkInsert
. They will use providers from Oracle.ManagedDataAccess
. packages and Simple.Data
packages accordingly. You can think about this as something familiar to the CQRS approach: OracleContext
for common operations and OracleContextBulkInsert
for highly effective inserts.
Repository
I think it is a good idea to implement the repository pattern to decorate and hide our two contexts, because, who wants to know about them and inner logic, that stay behind? That is why, context classes have internal
access modifier.
public interface IOracleRepository
{
void Insert<TContext>(IEnumerable<TContext> entities) where TContext : class;
List<TDto> GetList<TContext, TDto>(Expression<Func<TContext, bool>> predicate,
Expression<Func<TContext, TDto>> selector) where TContext : class;
}
public class OracleRepository : IOracleRepository
{
private OracleContext CreateContext(bool isSelect = true)
{
return isSelect ? new OracleContext() : new OracleContextBulkInsert();
}
private DbSet<T> Table<T>(OracleContextBase context) where T : class
{
return context.Set<T>();
}
public List<TDto> GetList<TContext, TDto>(Expression<Func<TContext, bool>> predicate,
Expression<Func<TContext, TDto>> selector) where TContext : class
{
using (var context = CreateContext())
{
return Table<TContext>(context).Where(predicate).Select(selector).ToList();
}
}
public void Insert<TContext>(IEnumerable<TContext> entities) where TContext : class
{
using (var context = CreateContext(false))
{
context.BulkInsert(entities);
}
}
}
As you can see, it is quite a usual implementation of repository pattern, except for one important thing - CreateContext
method's result depends on inside which method it is called.
Usage
As I already said, we won't use our context directly, but through repository, not worrying about inner implemented logic. This is a simple example:
var repo = new OracleRepository();
var sales = repo.GetList<Sale, SalesDto>(x => x.Amount > 100,
x => new SalesDto { Id = x.Id, Amount = x.Amount});
repo.Insert(new List<Sales> {
new Sale{ Amount = 200 },
new Sale{ Amount = 300 }
});
App.config
<configuration>
<!-- other stuff -->
<appSettings>
<add key="InnerDbProviderName" value="Devart.Data.Oracle" />
</appSettings>
<connectionStrings>
<add name="OracleContextBulkInsert" connectionString="your connection string"
providerName="Devart.Data.Oracle" />
<add name="OracleContext" connectionString="your connection string"
providerName="Oracle.ManagedDataAccess.Client"/>
</connectionStrings>
</configuration>
There are two very important things:
- You should specify:
<add key="InnerDbProviderName" value="Devart.Data.Oracle"/>
- it is a requirement of BulkInsert
project. - Now you have two connection strings, because of two contexts and two corresponding providers, as you can see in App.config (of course, connection strings must be the same and I hope that it is not very hard for you to specify it twice).
Issues
If you use such web diagnostic platforms as Glimpse: http://getglimpse.com/, there will be a problem with its compatibility against BulkInsert
library (exception: BulkInsertProviderNotFoundException
will be thrown). This problem is fixed in case of MS SQL here, but, unfortunately, there is no existing solution for Oracle database, so you should refuse to use Glimpse or fix this problem by yourself. On the other hand, you can use MiniProfiler
(http://miniprofiler.com/) as alternative, it is fully compatible with our situation.
Also bitness of your Oracle Client (that should be installed on target PC) should be the same as your application. For example, in case of IIS, you can tune one option for app pool - allow 32 bit applications to true value, it will fix the problem, if you have x86 Oracle Client, but your application has x64 bitness.
Arrays in a Bind Solution (Without Entity Framework)
If you don't want to have a deal with BulkInsert
or you do not use Entity Framework, there is another solution, that provides results that are good enough. You can find the full description here. The main idea is that you are creating a very simple query like this:
insert into mytable(columnName1, columnName2, ..) values(:columnName1, :columnName2, ...)
where :columnName1, :columnName2
are parameters, but instead of being simple atomic values, they will be arrays. That is all, but you should also assign property ArrayBindCount
at OracleCommand
variable to the corresponding value. Full solution, that was made, based on code from the link above with adding some modifications: reflection and generic approaches, will look like this:
private static Dictionary<Type, OracleDbType> dotNet2Oracle { get; set; } =
new Dictionary<Type, OracleDbType> {
[typeof(string)] = OracleDbType.Varchar2,
[typeof(int)] = OracleDbType.Int32,
[typeof(DateTime)] = OracleDbType.Date,
[typeof(decimal)] = OracleDbType.Decimal,
};
public void BatchInsert<TContext>(IEnumerable<TContext> entities) where TContext : class
{
if (entities.Count() == 0)
return;
using (var con = new OracleConnection
(ConfigurationManager.ConnectionStrings["OracleContextInsert"].ConnectionString))
{
con.Open();
var cmd = con.CreateCommand();
var type = entities.GetType().GetGenericArguments()[0];
var tableName = (type.GetCustomAttributes
(typeof(TableAttribute), false).FirstOrDefault() as TableAttribute)?.Name ?? type.Name;
cmd.CommandText = $@"insert into {tableName} ({type.GetProperties().Select(x => x.Name)
.Aggregate((a, b) => $"{a}, {b}")}) values
({type.GetProperties().Select(x => $":{x.Name}")
.Aggregate((a, b) => $"{a}, {b}")})";
cmd.ArrayBindCount = entities.Count();
var oracleTypesType = typeof(OracleDbType);
foreach (var prop in type.GetProperties())
{
var param = new OracleParameter();
param.ParameterName = prop.Name;
var customType = (prop.GetCustomAttributes
(typeof(ColumnAttribute), false).FirstOrDefault() as ColumnAttribute)?.TypeName;
if (customType != null)
{
OracleDbType enumVal;
if (Enum.TryParse(customType, out enumVal))
param.OracleDbType = enumVal;
else
throw new Exception($@"Custom column's type ({customType}) has
no corresponding projection at OracleDbType!");
}
else
{
var targetType = prop.PropertyType
.GetGenericArguments().FirstOrDefault() ?? prop.PropertyType;
if (!dotNet2Oracle.ContainsKey(targetType))
throw new Exception($@"Corresponding Oracle type for this .Net type
({targetType.Name}) not founded,
add necessary pair to dotNet2Oracle dictionary!");
param.OracleDbType = dotNet2Oracle[targetType];
}
param.Value = entities.Select(x => type.GetProperty(prop.Name).GetValue(x)).ToArray();
cmd.Parameters.Add(param);
}
cmd.ExecuteNonQuery();
}
}
By default, table's name will be the same as the name of TContext
type, to change it, just add TableAttribute
, specifying desired value to Name
property, columns names are properties names. dotNet2Oracle
dictionary contains type mappings between .NET and Oracle types, you can simply add other necessary values to it, Also, it is possible to specify Oracle type directly via ColumnAttribute
and its property TypeName
. For this solution, you should install only item 3. from BulkInsert
list. Performance comparison - this approach is about 1.5 times worse than BulkInsert
, but has no issues, described in the previous section.
Conclusion
In this article, I have shown you how to implement BulkInsert
approach with Oracle database with the help of library: https://github.com/Thorium/EntityFramework.BulkInsert. One can use this solution in case of code-first, model-first or database-first approaches, of course, some changes and additional stuff may be needed in case of concrete variant, but the core approach will remain the same. Also, you can use Arrays in a Bind
approach (if you do not use Entity Framework), that is a more simple, but less effective solution.