The other day at work, I had a task that required me to do a bulk insert of data into a SQL Server database table. I have obviously come across (and used in the past) the bcp.exe command-line utility, which is all well and good when you want to run scripts, etc.
This time, however, I wanted to do the bulk insert programmatically using some standard .NET code. As I say, this is not something I have had to do in code before, so I set out to find out how, and after a few minutes of Googling I found the answer I was looking for: the SqlBulkCopy class.
This class has been available in .NET since v2.0. I guess if you don't need these things, they sometimes slip you by, which was the case here, for me anyway!
The main methods you would use in this class are the WriteToServer(..) overloads, which accept a DataTable, a DataRow[], or an IDataReader:
WriteToServer(DataRow[])
WriteToServer(DataTable)
WriteToServer(IDataReader)
WriteToServer(DataTable, DataRowState)
WriteToServerAsync(DataRow[])
WriteToServerAsync(DataTable)
WriteToServerAsync(IDataReader)
WriteToServerAsync(DataRow[], CancellationToken)
WriteToServerAsync(DataTable, DataRowState)
WriteToServerAsync(DataTable, CancellationToken)
WriteToServerAsync(IDataReader, CancellationToken)
You generally want to use the overloads above that take an IDataReader. This is because a data reader is a forward-only, read-only stream: it does not hold all of the data in memory, which makes it much faster than the DataTable and DataRow[] approaches.
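For contrast, here is a minimal sketch of the simpler DataTable route. The table name, column names, and connection string are just illustrative (they match the sample later in this post); note that every row is materialised in memory before the copy starts, which is exactly what the IDataReader approach avoids:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class DataTableBulkCopyDemo
{
    static void Main()
    {
        // Build an in-memory table whose columns match the destination table.
        var table = new DataTable("Person");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("DateOfBirth", typeof(DateTime));

        for (int i = 0; i < 1000; i++)
        {
            table.Rows.Add("Person" + i, new DateTime(1980, 1, 1).AddDays(i));
        }

        using (var connection = new SqlConnection(
            "Server=.;Database=MostWanted;Integrated Security=SSPI"))
        {
            connection.Open();
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "Person";
                // All 1000 rows are already held in memory at this point,
                // unlike the streaming IDataReader approach shown below.
                bulkCopy.WriteToServer(table);
            }
        }
    }
}
```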
The scenario I was trying to deal with was how to do bulk inserts, and I came across this very good post by Mike Goatly, which goes into a lot of detail:
And there was also this one over at Codeproject by AzamSharp:
Azam demonstrates how to use the SqlBulkCopy class to do a bulk copy, so if that is what you are after, check out his article. My scenario was a bulk insert, which luckily is exactly what Mike Goatly writes about in the post I listed above.
Bulk Insert
The trick is that to use the SqlBulkCopy class for a bulk insert, we need to create a custom IDataReader. This would be a cinch if we could do something like ObjectDataReader&lt;SomeObject&gt; and use that to feed WriteToServer() with a set of objects.
Unfortunately, no such class exists in the framework, so you're going to have to implement your own. Here is the IDataReader interface that needs implementing:
public interface IDataReader : IDisposable, IDataRecord
{
    int Depth { get; }
    bool IsClosed { get; }
    int RecordsAffected { get; }

    void Close();
    DataTable GetSchemaTable();
    bool NextResult();
    bool Read();
}
Mike Goatly gives us a working implementation of this, which is as follows:
namespace SqlBulkCopyExample
{
    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;
    using System.Linq.Expressions;
    using System.Reflection;

    public class ObjectDataReader<TData> : IDataReader
    {
        private IEnumerator<TData> dataEnumerator;
        private Func<TData, object>[] accessors;
        private Dictionary<string, int> ordinalLookup;

        public ObjectDataReader(IEnumerable<TData> data)
        {
            this.dataEnumerator = data.GetEnumerator();

            var propertyAccessors = typeof(TData)
                .GetProperties(BindingFlags.Instance | BindingFlags.Public)
                .Where(p => p.CanRead)
                .Select((p, i) => new
                {
                    Index = i,
                    Property = p,
                    Accessor = CreatePropertyAccessor(p)
                })
                .ToArray();

            this.accessors = propertyAccessors.Select(p => p.Accessor).ToArray();
            this.ordinalLookup = propertyAccessors.ToDictionary(
                p => p.Property.Name,
                p => p.Index,
                StringComparer.OrdinalIgnoreCase);
        }

        private Func<TData, object> CreatePropertyAccessor(PropertyInfo p)
        {
            var parameter = Expression.Parameter(typeof(TData), "input");
            var propertyAccess = Expression.Property(parameter, p.GetGetMethod());
            var castAsObject = Expression.TypeAs(propertyAccess, typeof(object));
            var lamda = Expression.Lambda<Func<TData, object>>(castAsObject, parameter);
            return lamda.Compile();
        }

        #region IDataReader Members

        public void Close()
        {
            this.Dispose();
        }

        public int Depth
        {
            get { return 1; }
        }

        public DataTable GetSchemaTable()
        {
            return null;
        }

        public bool IsClosed
        {
            get { return this.dataEnumerator == null; }
        }

        public bool NextResult()
        {
            return false;
        }

        public bool Read()
        {
            if (this.dataEnumerator == null)
            {
                throw new ObjectDisposedException("ObjectDataReader");
            }

            return this.dataEnumerator.MoveNext();
        }

        public int RecordsAffected
        {
            get { return -1; }
        }

        #endregion

        #region IDisposable Members

        public void Dispose()
        {
            this.Dispose(true);
            GC.SuppressFinalize(this);
        }

        protected void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (this.dataEnumerator != null)
                {
                    this.dataEnumerator.Dispose();
                    this.dataEnumerator = null;
                }
            }
        }

        #endregion

        #region IDataRecord Members

        public int FieldCount
        {
            get { return this.accessors.Length; }
        }

        public bool GetBoolean(int i)
        {
            throw new NotImplementedException();
        }

        public byte GetByte(int i)
        {
            throw new NotImplementedException();
        }

        public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferoffset, int length)
        {
            throw new NotImplementedException();
        }

        public char GetChar(int i)
        {
            throw new NotImplementedException();
        }

        public long GetChars(int i, long fieldoffset, char[] buffer, int bufferoffset, int length)
        {
            throw new NotImplementedException();
        }

        public IDataReader GetData(int i)
        {
            throw new NotImplementedException();
        }

        public string GetDataTypeName(int i)
        {
            throw new NotImplementedException();
        }

        public DateTime GetDateTime(int i)
        {
            throw new NotImplementedException();
        }

        public decimal GetDecimal(int i)
        {
            throw new NotImplementedException();
        }

        public double GetDouble(int i)
        {
            throw new NotImplementedException();
        }

        public Type GetFieldType(int i)
        {
            throw new NotImplementedException();
        }

        public float GetFloat(int i)
        {
            throw new NotImplementedException();
        }

        public Guid GetGuid(int i)
        {
            throw new NotImplementedException();
        }

        public short GetInt16(int i)
        {
            throw new NotImplementedException();
        }

        public int GetInt32(int i)
        {
            throw new NotImplementedException();
        }

        public long GetInt64(int i)
        {
            throw new NotImplementedException();
        }

        public string GetName(int i)
        {
            throw new NotImplementedException();
        }

        public int GetOrdinal(string name)
        {
            int ordinal;
            if (!this.ordinalLookup.TryGetValue(name, out ordinal))
            {
                throw new InvalidOperationException("Unknown parameter name " + name);
            }

            return ordinal;
        }

        public string GetString(int i)
        {
            throw new NotImplementedException();
        }

        public object GetValue(int i)
        {
            if (this.dataEnumerator == null)
            {
                throw new ObjectDisposedException("ObjectDataReader");
            }

            return this.accessors[i](this.dataEnumerator.Current);
        }

        public int GetValues(object[] values)
        {
            throw new NotImplementedException();
        }

        public bool IsDBNull(int i)
        {
            throw new NotImplementedException();
        }

        public object this[string name]
        {
            get { throw new NotImplementedException(); }
        }

        public object this[int i]
        {
            get { throw new NotImplementedException(); }
        }

        #endregion
    }
}
With this very useful code that Mike provides, all we need to do is something like the following to perform a bulk insert using an IDataReader with the SqlBulkCopy class:
namespace SqlBulkCopyExample
{
    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;
    using System.Diagnostics;
    using System.Linq;
    using SqlBulkCopyExample.Properties;

    class Program
    {
        static void Main(string[] args)
        {
            var people = CreateSamplePeople(10000);

            using (var connection = new SqlConnection(
                "Server=.;Database=MostWanted;Integrated Security=SSPI"))
            {
                connection.Open();
                InsertDataUsingSqlBulkCopy(people, connection);
            }
        }

        private static void InsertDataUsingSqlBulkCopy(
            IEnumerable<Person> people, SqlConnection connection)
        {
            var bulkCopy = new SqlBulkCopy(connection);
            bulkCopy.DestinationTableName = "Person";
            bulkCopy.ColumnMappings.Add("Name", "Name");
            bulkCopy.ColumnMappings.Add("DateOfBirth", "DateOfBirth");

            using (var dataReader = new ObjectDataReader<Person>(people))
            {
                bulkCopy.WriteToServer(dataReader);
            }
        }

        private static IEnumerable<Person> CreateSamplePeople(int count)
        {
            return Enumerable.Range(0, count)
                .Select(i => new Person
                {
                    Name = "Person" + i,
                    DateOfBirth = new DateTime(
                        1950 + (i % 50),
                        ((i * 3) % 12) + 1,
                        ((i * 7) % 29) + 1)
                });
        }
    }
}
I grabbed the bulk of this code from Mike's original post, where he does a much more thorough job of explaining things, and he has a nice little demo project in which you can compare standard one-by-one inserts against this approach; the difference is huge.
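For reference, the row-by-row alternative being benchmarked against looks something like the sketch below. This is my own illustration, not Mike's code: the INSERT statement and the Person class definition are assumptions matching the sample above. The point is that each row costs a full round trip to the server:

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

class Person
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
}

class RowByRowInsertDemo
{
    // The naive approach: one INSERT statement (and one server round trip)
    // per object, which is why SqlBulkCopy wins by such a large margin.
    static void InsertOneByOne(IEnumerable<Person> people, SqlConnection connection)
    {
        foreach (var person in people)
        {
            using (var command = new SqlCommand(
                "INSERT INTO Person (Name, DateOfBirth) VALUES (@Name, @DateOfBirth)",
                connection))
            {
                command.Parameters.AddWithValue("@Name", person.Name);
                command.Parameters.AddWithValue("@DateOfBirth", person.DateOfBirth);
                command.ExecuteNonQuery();
            }
        }
    }
}
```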
Happy days, thanks Mike, certainly made my day a lot easier.