Introduction
SqlBulkCopy is a really useful and fast way to insert large amounts of data in a few seconds, but it only accepts a DataTable (or an IDataReader) as its source when inserting into tables.
Managing DataTables is very problematic: we have to specify columns and rows manually, with all the names as plain strings. If we miss something, everything falls apart, and there is no support for even basic validation.
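For example, filling a table by hand looks like this (the table and column names here are just illustrative); every name is a magic string, and a typo surfaces only at runtime:
var table = new DataTable("TBookInfo");
table.Columns.Add("ISBN", typeof(string));
table.Columns.Add("Title", typeof(string));

var row = table.NewRow();
row["ISBN"] = "9780000000001";   // misspell "ISBN" here and it fails only at runtime
row["Title"] = "Sample Title";
table.Rows.Add(row);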
In this tutorial, I am going to create some classes that act as an API through which we can take advantage of SqlBulkCopy with strongly typed objects, with basic validation supported via data annotations.
Custom attribute classes
First, we are going to create custom attributes. Through these, we will be able to specify the column names and the table name on the model itself.
The DataTableAttribute implementation for specifying the table name:
[AttributeUsage(AttributeTargets.Class)]
public class DataTableAttribute : Attribute
{
    public DataTableAttribute(string tableName)
        : this()
    {
        TableName = tableName;
    }

    private DataTableAttribute()
    {
        TableName = string.Empty;
    }

    public string TableName { get; set; }
}
The DataColumnAttribute implementation for specifying column names and types:
[AttributeUsage(AttributeTargets.Property)]
public class DataColumnAttribute : Attribute
{
    public DataColumnAttribute(string name, Type type)
        : this()
    {
        ColumnName = name;
        ColumnType = type;
    }

    private DataColumnAttribute()
    {
        ColumnName = string.Empty;
        ColumnType = null;
    }

    public string ColumnName { get; set; }
    public Type ColumnType { get; set; }
}
You might be wondering why we need these custom attributes when we could get the table name, column names, and data types from the class itself.
The advantage of this approach is that the class can carry extra properties that take no part in the DB operation, or properties whose names differ between the code and the database.
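For instance, a hypothetical model (the names here are made up for illustration) could map a property to a differently named column while carrying a helper property that never reaches the database:
[DataTable("TAuthor")]
public class Author
{
    // The property name in code differs from the DB column name.
    [DataColumn("AuthorFullName", typeof(string))]
    public string Name { get; set; }

    // No DataColumnAttribute, so this property is ignored by the bulk insert.
    public bool IsDirty { get; set; }
}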
Data Model
In the second phase, let's decorate our DB model class with the above custom attributes and with data annotation attributes for validation. The data annotation attributes can be skipped if you do not require any validation for the bulk insert.
Note: You need to add a reference to System.ComponentModel.DataAnnotations.dll if you are planning to use validation.
Data model for bulk insert:
[DataTable("TBookInfo")]
public class BookInfo
{
[DataColumn("ISBN", typeof(string))]
[Required(ErrorMessage = "EAN is required")]
[MaxLength(50, ErrorMessage = "EAN is exceeded max length of 50")]
public string EAN { get; set; }
[DataColumn("Title", typeof(string))]
[Required(ErrorMessage = "Book title is required")]
[MaxLength(255, ErrorMessage = "Book title is exceeded max length of 255")]
public string Title { get; set; }
[DataColumn("PublisherName", typeof(string))]
[Required(ErrorMessage = "Publisher name is required")]
[MaxLength(255, ErrorMessage = "Publisher name is exceeded max length of 255")]
public string PublisherName { get; set; }
[DataColumn("Synopsis", typeof(string))]
public string Synopsis { get; set; }
[DataColumn("ReleaseDate", typeof(DateTime))]
public DateTime? ReleaseDate { get; set; }
[DataColumn("NumberOfPages", typeof(int))]
public int? NumberOfPages { get; set; }
}
DataColumn and DataTable are our custom attributes defined above and will be used later; Required and MaxLength come from the data annotation attributes.
Our model is now decorated with all the attributes it needs to take part in SqlBulkCopy.
Data model conversion implementation
Let's go into the implementation of the automatic conversion from the above class to a DataTable. First, the function for getting the table name from the object:
public static string GetTableNameFromObject<T>()
{
    var arrTableAttrib = typeof(T).GetCustomAttributes(typeof(DataTableAttribute), true);
    return arrTableAttrib.Any() ? ((DataTableAttribute)arrTableAttrib.First()).TableName : null;
}
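For example, with the BookInfo model defined above:
// Reads the DataTableAttribute from the class and returns "TBookInfo".
string tableName = GetTableNameFromObject<BookInfo>();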
So, we have a function for getting the table name. Next, let's write a function that returns each PropertyInfo, which will let us retrieve property values through reflection, together with its DataColumnAttribute, which gives us the data column name and its type.
NOTE: I am using a simple caching mechanism to store the PropertyInfo and DataColumnAttribute pairs for fast retrieval. You need to add a reference to System.Runtime.Caching.dll to use this caching.
Let's look at the code for getting those two values:
public static Dictionary<PropertyInfo, DataColumnAttribute> GetAllCustomProperty<T>()
{
    if (MemoryCache.Default[typeof(T).ToString()] == null)
    {
        // Collect every property carrying a DataColumnAttribute,
        // keyed by its PropertyInfo for later value retrieval.
        var props = (from prop in typeof(T).GetProperties()
                     where prop.GetCustomAttributes(typeof(DataColumnAttribute), true).Any()
                     select new
                     {
                         prop,
                         dataColumn = (DataColumnAttribute)
                             prop.GetCustomAttributes(typeof(DataColumnAttribute), true).FirstOrDefault()
                     }).ToDictionary(item => item.prop, item => item.dataColumn);

        // Cache the mapping under the type's full name for subsequent calls.
        MemoryCache.Default[typeof(T).ToString()] = props;
        return props;
    }
    return (Dictionary<PropertyInfo, DataColumnAttribute>)MemoryCache.Default[typeof(T).ToString()];
}
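A quick way to see what this function produces for the BookInfo model (just a sanity-check snippet):
foreach (var pair in GetAllCustomProperty<BookInfo>())
{
    // e.g. "EAN -> ISBN (System.String)"
    Console.WriteLine("{0} -> {1} ({2})",
        pair.Key.Name, pair.Value.ColumnName, pair.Value.ColumnType);
}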
I have simplified things with the single function above; the LINQ query keeps it easy to understand.
A DataTable's Columns collection accepts an array of DataColumn objects. So, let's declare a method that returns such an array; it will be used more than once, and it builds on the function above.
public static DataColumn[] GetAllDataTableColumn<T>()
{
    return (from item in GetAllCustomProperty<T>()
            select new DataColumn(item.Value.ColumnName, item.Value.ColumnType))
           .ToArray();
}
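This is exactly the shape DataTable.Columns.AddRange() expects, so wiring up an empty table becomes a two-liner:
var table = new DataTable();
table.Columns.AddRange(GetAllDataTableColumn<BookInfo>());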
Now, we need a function that returns the PropertyInfo for a given custom column attribute name:
public static PropertyInfo GetPropertyName<T>(string dataColumnAttributeName)
{
    return (from item in GetAllCustomProperty<T>()
            where item.Value.ColumnName == dataColumnAttributeName
            select item.Key).FirstOrDefault();
}
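For example, asking for the "ISBN" column of BookInfo yields the PropertyInfo of the EAN property:
PropertyInfo prop = GetPropertyName<BookInfo>("ISBN");
// prop.Name == "EAN"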
The above function again goes through GetAllCustomProperty<T>(), so it gets the same performance benefit from the cache.
Everything is in place now. Finally, we need a function that converts a list of objects into the DataTable that SqlBulkCopy will operate on.
public static DataTable GetTransformedData<T>(IList<T> myDataObjectList)
{
    var dataTable = new DataTable(GetTableNameFromObject<T>());
    var allColumns = GetAllDataTableColumn<T>();
    dataTable.Columns.AddRange(allColumns);

    foreach (T record in myDataObjectList)
    {
        var row = dataTable.NewRow();
        foreach (var columnAttrib in allColumns)
        {
            // Map each property value to its column; nulls become DBNull.
            row[columnAttrib] = GetPropertyName<T>(
                columnAttrib.ColumnName).GetValue(record) ?? DBNull.Value;
        }
        dataTable.Rows.Add(row);
    }
    return dataTable;
}
Validation
We defined the custom attribute classes in the first phase and decorated our model class in the second; the third phase dealt with converting the model to a DataTable.
Now let's suppose you are getting values from some source and saving them into a lstBooks object.
Here is the code for validating lstBooks and removing the items that are invalid:
private static void ValidateData(ref List<BookInfo> lstBookInfoes)
{
    var results = new List<ValidationResult>();
    lstBookInfoes.RemoveAll(bookInfo =>
        !Validator.TryValidateObject(
            bookInfo,
            new ValidationContext(bookInfo, null, null),
            results,
            true));
}
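If you also want to know why a row was dropped, a slightly extended sketch (the logging here is illustrative) evaluates each record individually and reports its failures before removing it:
private static void ValidateData(ref List<BookInfo> lstBookInfoes)
{
    lstBookInfoes.RemoveAll(bookInfo =>
    {
        var results = new List<ValidationResult>();
        bool isValid = Validator.TryValidateObject(
            bookInfo, new ValidationContext(bookInfo, null, null), results, true);

        // Report every broken rule for this record, e.g. "EAN is required".
        foreach (var result in results)
            Console.WriteLine("{0}: {1}", bookInfo.Title, result.ErrorMessage);

        return !isValid;
    });
}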
Note: If we need more complex validation logic, we can implement the IValidatableObject interface on the model class, i.e., provide IEnumerable<ValidationResult> Validate(ValidationContext validationContext). In this method we can have any custom implementation.
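A minimal sketch of such an implementation on BookInfo, assuming a hypothetical cross-field rule (an already released book must state its page count), which a plain attribute cannot express:
public class BookInfo : IValidatableObject
{
    // ... properties decorated as shown earlier ...

    public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
    {
        // Hypothetical rule spanning two properties.
        if (ReleaseDate.HasValue && ReleaseDate.Value <= DateTime.Now
            && !NumberOfPages.HasValue)
        {
            yield return new ValidationResult(
                "Published books must specify the number of pages",
                new[] { "NumberOfPages" });
        }
    }
}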
Implementation of the core code for doing the bulk insert with SqlBulkCopy:
var lstBooks = GetSampleBookInfo();
ValidateData(ref lstBooks);

using (var con = new SqlConnection(
    ConfigurationManager.ConnectionStrings["BulkInsertDemo"].ConnectionString))
{
    using (var bulkCopy = new SqlBulkCopy(con))
    {
        con.Open();
        var dt = DataTableConverter.GetTransformedData(lstBooks);
        bulkCopy.DestinationTableName =
            DataTableConverter.GetTableNameFromObject<BookInfo>();
        bulkCopy.BatchSize = 5000;
        bulkCopy.WriteToServer(dt);
    }
}
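GetSampleBookInfo() above is just a stand-in for whatever source feeds you data; a minimal sketch of it (with made-up values) could be:
private static List<BookInfo> GetSampleBookInfo()
{
    return new List<BookInfo>
    {
        new BookInfo
        {
            EAN = "9780000000001",
            Title = "Sample Book",
            PublisherName = "Sample Publisher",
            ReleaseDate = new DateTime(2012, 1, 1),
            NumberOfPages = 320
        },
        // Invalid record: Title is missing, so ValidateData() will drop it.
        new BookInfo { EAN = "9780000000002", PublisherName = "Sample Publisher" }
    };
}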
The transformation is done in a generic way, so whenever you need to do a bulk insert, just create a class with the required attributes and everything else is taken care of.
Retrieving values through reflection stays fast because of the caching mechanism, so there is very little or no performance overhead.
Note: In some cases the DataTable has issues related to column ordering. You can change the order of your class members or of the table columns; alternatively, the DataColumnAttribute class can be modified to carry an ordering sequence for the model class, as sketched below.
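A hedged sketch of that modification (the Order property is my addition, not part of the code above): give the attribute an order value and sort by it when building the column array.
[AttributeUsage(AttributeTargets.Property)]
public class DataColumnAttribute : Attribute
{
    public DataColumnAttribute(string name, Type type, int order = 0)
    {
        ColumnName = name;
        ColumnType = type;
        Order = order;
    }

    public string ColumnName { get; set; }
    public Type ColumnType { get; set; }
    public int Order { get; set; }
}

// GetAllDataTableColumn<T>() then sorts by the new property:
public static DataColumn[] GetAllDataTableColumn<T>()
{
    return (from item in GetAllCustomProperty<T>()
            orderby item.Value.Order
            select new DataColumn(item.Value.ColumnName, item.Value.ColumnType))
           .ToArray();
}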
Note: In the demo download, please modify the actual path of the DB file in app.config if the DB connection is failing.