I was recently tasked with a project at a company to update a SQL Server 2008 database with large amounts of data each day. The task at first seemed daunting: the files each contained well over 400,000 records, and several needed processing daily. I first tried LINQ to SQL, but with that amount of data the inserts were slow, to say the least. Then I remembered the SqlBulkCopy class. SqlBulkCopy lets you efficiently bulk load a SQL Server table with data from another source. The SqlBulkCopy class can write data only to SQL Server tables; however, the data source is not limited to SQL Server. Any data source can be used, as long as the data can be loaded into a DataTable instance or read with an IDataReader instance. For this example the file will contain roughly 1,000 records, but the same code can handle much larger amounts of data.
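As a quick illustration of the IDataReader route mentioned above, SqlBulkCopy can stream rows straight from a reader without building a DataTable first. The following is a minimal sketch, not part of the original project; the table names and connection strings are hypothetical:

```csharp
using System.Data.SqlClient;

// Hypothetical connection strings, for illustration only.
string sourceCs = @"Data Source=.;Initial Catalog=SourceDb;Integrated Security=True";
string destCs = @"Data Source=.;Initial Catalog=DestDb;Integrated Security=True";

using (SqlConnection src = new SqlConnection(sourceCs))
using (SqlConnection dest = new SqlConnection(destCs))
{
    src.Open();
    dest.Open();

    SqlCommand cmd = new SqlCommand("SELECT * FROM dbo.SourceTable", src);
    using (SqlDataReader reader = cmd.ExecuteReader())
    using (SqlBulkCopy copy = new SqlBulkCopy(dest))
    {
        copy.DestinationTableName = "dbo.DestinationTable";
        // Rows are streamed as the reader produces them, so the whole
        // result set never has to fit in memory at once.
        copy.WriteToServer(reader);
    }
}
```

This approach is worth considering when the source is itself a database, or when the file is too large to hold in a DataTable comfortably.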
To begin with, let’s create a table in SQL Server that will hold the data. Copy the following T-SQL into SQL Server to create your table:
CREATE TABLE [dbo].[Censis](
[Suburb] [varchar](200) NULL,
[NotStated] [int] NULL,
[NotApplicable] [int] NULL,
[Fishing] [int] NULL,
[Mining] [int] NULL,
[Manufacturing] [int] NULL,
[Electricity] [int] NULL,
[Construction] [int] NULL
) ON [PRIMARY]
GO
The table above will hold Census data that is freely available to download in Australia.
The next step is to create a console application that will bulk load the data. Open Visual Studio 2008 and choose File > New > Project, then under Windows select Console Application.
Before moving on, I'll work backwards and explain the method that ultimately bulk loads the data. SqlBulkCopy has a method called WriteToServer. One of the overloads of this method takes a DataTable as its parameter. Because a DataTable contains rows and columns, it seemed like a logical choice for the task I was facing.
Jumping back to the example, we now know we need to create a DataTable that contains the information from the text file. The code below demonstrates how to do this:
using System.Data;
using System.IO;

DataTable dt = new DataTable();
string line = null;
int i = 0;

using (StreamReader sr = File.OpenText(@"c:\temp\table1.csv"))
{
    while ((line = sr.ReadLine()) != null)
    {
        string[] data = line.Split(',');
        if (data.Length > 0)
        {
            // The first line determines how many columns the table needs.
            // Note that it is also added as a data row below, which assumes
            // the file has no header line.
            if (i == 0)
            {
                foreach (var item in data)
                {
                    dt.Columns.Add(new DataColumn());
                }
                i++;
            }
            DataRow row = dt.NewRow();
            row.ItemArray = data;
            dt.Rows.Add(row);
        }
    }
}
VB.NET
Imports System.Data
Imports System.IO

Dim dt As New DataTable()
Dim line As String = Nothing
Dim i As Integer = 0

Using sr As StreamReader = File.OpenText("c:\temp\table1.csv")
    line = sr.ReadLine()
    Do While line IsNot Nothing
        Dim data() As String = line.Split(","c)
        If data.Length > 0 Then
            ' The first line determines how many columns the table needs.
            ' Note that it is also added as a data row below, which assumes
            ' the file has no header line.
            If i = 0 Then
                For Each item In data
                    dt.Columns.Add(New DataColumn())
                Next item
                i += 1
            End If
            Dim row As DataRow = dt.NewRow()
            row.ItemArray = data
            dt.Rows.Add(row)
        End If
        line = sr.ReadLine()
    Loop
End Using
In the code above, I created a DataTable that will store all the information from the CSV file, which resides in the C:\Temp directory. I am using a StreamReader object to open the file and read each line. Each line is then split into a string array, and that string array is assigned to each DataRow as its ItemArray value, which sets the values for the row from the array.
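If your file does have a header line, a small variation (a sketch of my own, not from the original project) uses that first line to name the columns instead of inserting it as data; named columns make name-based mappings possible later on:

```csharp
using System.Data;
using System.IO;

DataTable dt = new DataTable();
bool firstLine = true;

using (StreamReader sr = File.OpenText(@"c:\temp\table1.csv"))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        string[] data = line.Split(',');
        if (firstLine)
        {
            // Use the header line for column names rather than adding it as a row.
            foreach (string name in data)
            {
                dt.Columns.Add(new DataColumn(name.Trim()));
            }
            firstLine = false;
        }
        else
        {
            DataRow row = dt.NewRow();
            row.ItemArray = data;
            dt.Rows.Add(row);
        }
    }
}
```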
When the file has been read, the next step is to use the SqlBulkCopy class to insert the data into SQL Server. The following code demonstrates how to do this:
using System.Configuration;
using System.Data.SqlClient;

using (SqlConnection cn = new SqlConnection(ConfigurationManager.ConnectionStrings[
    "ConsoleApplication3.Properties.Settings.daasConnectionString"].ConnectionString))
{
    cn.Open();
    using (SqlBulkCopy copy = new SqlBulkCopy(cn))
    {
        // Map the first five source columns to the first five destination
        // columns by ordinal; add further mappings as required.
        copy.ColumnMappings.Add(0, 0);
        copy.ColumnMappings.Add(1, 1);
        copy.ColumnMappings.Add(2, 2);
        copy.ColumnMappings.Add(3, 3);
        copy.ColumnMappings.Add(4, 4);
        copy.DestinationTableName = "Censis";
        copy.WriteToServer(dt);
    }
}
VB.NET
Imports System.Configuration
Imports System.Data.SqlClient

Using cn As New SqlConnection(ConfigurationManager.ConnectionStrings( _
    "ConsoleApplication3.Properties.Settings.daasConnectionString").ConnectionString)
    cn.Open()
    Using copy As New SqlBulkCopy(cn)
        ' Map the first five source columns to the first five destination
        ' columns by ordinal; add further mappings as required.
        copy.ColumnMappings.Add(0, 0)
        copy.ColumnMappings.Add(1, 1)
        copy.ColumnMappings.Add(2, 2)
        copy.ColumnMappings.Add(3, 3)
        copy.ColumnMappings.Add(4, 4)
        copy.DestinationTableName = "Censis"
        copy.WriteToServer(dt)
    End Using
End Using
SqlBulkCopy uses ADO.NET to connect to a database to bulk load the data. I have created a SqlConnection object, and that object reference is used to create the SqlBulkCopy object. The DestinationTableName property references the table in the database where the data is to be loaded. A handy feature of SqlBulkCopy is the SqlBulkCopyColumnMappingCollection. Column mappings define the relationships between columns in the data source and columns in the destination, which is useful if the source file has columns that don't need to be inserted into the database. Mappings can be set by index, as in the example above, or by column name. Using the index is handy when you're working with files that contain no column names; just make sure your DataTable and SQL table columns are in the same order. Finally, the data is sent to the database by calling the WriteToServer method.
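For completeness, here is a brief sketch of what mapping by name looks like, along with two optional tuning properties (BatchSize and BulkCopyTimeout) that can help with very large loads. It reuses the cn connection and dt DataTable from the listings above, and assumes the DataTable columns carry the same names as the Censis table:

```csharp
using System.Data.SqlClient;

using (SqlBulkCopy copy = new SqlBulkCopy(cn))
{
    // Map source columns to destination columns by name.
    copy.ColumnMappings.Add("Suburb", "Suburb");
    copy.ColumnMappings.Add("NotStated", "NotStated");
    copy.ColumnMappings.Add("Fishing", "Fishing");

    copy.DestinationTableName = "Censis";
    copy.BatchSize = 10000;    // send rows to the server in batches
    copy.BulkCopyTimeout = 60; // seconds before the operation times out

    copy.WriteToServer(dt);
}
```

Name-based mappings are more resilient than ordinal ones when the column order of the source file or destination table changes.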
In short, the SqlBulkCopy class is far faster than row-by-row approaches such as LINQ to SQL when inserting large amounts of data.