Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Bulk Data Management

0.00/5 (No votes)
28 Nov 2012 1  
This is the best way to handle or manage bulk data insert to sql server taken from my technical blog http://www.srinetinfo.com/2012/11/bulk-data-management.html

Introduction

Microsoft .NET has very good functionality to implement. It provides all sorts of functionality to achieve our feature needs. Only thing is we need to know which namespace and which class we should use for it. One of them and very useful class provided by Microsoft is SqlBulkCopy it is under namespace System.Data.SqlClient.

This class is very useful in transferring data from application to SQL Server. This is one of the best practices to be practiced by each .net developer. 

Background

Its been 4 years when I was giving interview to ValueLabs, the interviewer asked me a simple question.

Q. How are you inserting data to tables?
A. I replied by using a stored procedure.
    
Q. If you have some ten records to be inserted at once what you will do?
A. I will prepare an XML and send it to stored procedure. inside sp I will write code to handle XML.

Q. What if you are given a Excel file to insert some 1000's of records to insert. preparing xml and sending will take lots of time even your varchar parameter type may not support some time how you will do in that case? 
A. XML is the only way I am using so far to insert set of records. That way it works and it never gave me any error.

Q. What if we want to update a set of records, does your xml scenario work?
A. I have no answer, as I never faced such situation. So I was quite. Then he replied and explained me have you ever heard of SqlBulkCopy? Which will insert any number of records in fractions of seconds? Then I was amazed and happy too know this. I was not through that interview but I knew a new concept. So I implemented same in my company and got very good appreciation from higher officials. Now I would like to share my experience and the SqlBulkCopy class usage with you. 

SqlBulkCopy - Differenet Scenarios

Scenario I: Contacts File Upload

In your applications you may get a requirement like upload number of contacts that are in a EXCEL, CSV, or XML file to a database table. Which is most common and repeating work and every developer in his life time will do this kind applications.

Scenario-II: Producer-Consumer Problem

This is also some application problem where the requirement is like all producers (users or services) send message or requests to server queue. The server queue is periodically cleaned by processing or parsing each message and inserting them into database table.

Scenario-III: Inserting Multiple Records From UI

Consider you are given a requirement for adding more than one customer record in a page. Provided multiple rows to add multiple customer in a single page. what will you do int this scenario. Do you insert one by one records which requires lot of db calls and will reduce the application performance. There are many scenarios like this to discuss. For all these things to insert more than one record into db table we require a mechanism to handle it very efficiently and with best optimized performance. The .NET Framework has a very good class to handle this. Now we will discuss the same. in this article. you can refer SqlBulkCopy class here.

Using the code 

Let's start with an example. Just insert multiple rows into database table in 3 different ways and examine the results

Software Requirements

Visual Studio 2010 / 2012

SQL Serer 2008 R2 / 2012

Create Customers table on database by using the following script

USE [BlogSamples] 
GO
/****** Object:  Table [dbo].[Customers]    Script Date: 11/17/2012 1:07:38 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Customers](
 [ID] [bigint] IDENTITY(1,1) NOT NULL,
 [FirstName] [varchar](50) NOT NULL,
 [LastName] [varchar](50) NULL,
 [Address1] [varchar](50) NOT NULL,
 [Address2] [varchar](50) NULL,
 [City] [varchar](50) NOT NULL,
 [State] [varchar](50) NOT NULL,
 [Country] [varchar](50) NOT NULL,
 [ZipCode] [varchar](50) NULL,
 CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED 
(
 [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
   IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
   ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO

Create a Console application on visual studio as shown in the figure 

Bulk Data Management Application

Now we will insert the rows into the table from three different ways and will examine the performance

Multiple Rows Multiple DB Calls:

Create a stored procedure to insert individual rows into the database

       -- ================================================ 
-- Template generated from Template Explorer using:
-- Create Procedure (New Menu).SQL
--
-- Use the Specify Values for Template Parameters 
-- command (Ctrl-Shift-M) to fill in the parameter 
-- values below.
--
-- This block of comments will not be included in
-- the definition of the procedure.
-- ================================================
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:  <Author,,Name>
-- Create date: <Create Date,,>
-- Description: <Description,,>
-- =============================================
CREATE PROCEDURE Customers_Insert 
 -- Add the parameters for the stored procedure here
 @firstname varchar(50),
 @lastname varchar(50),
 @address1 varchar(50),
 @address2 varchar(50),
 @city varchar(50),
 @state varchar(50),
 @country varchar(50),
 @zipcode varchar(50)
AS
BEGIN
 -- SET NOCOUNT ON added to prevent extra result sets from
 -- interfering with SELECT statements.
 SET NOCOUNT ON;
    INSERT INTO [dbo].[Customers]
           ([FirstName]
           ,[LastName]
           ,[Address1]
           ,[Address2]
           ,[City]
           ,[State]
           ,[Country]
           ,[ZipCode])
     VALUES
           (@firstname,@lastname,@address1,@address2,@city,@state,@country,@zipcode)
END
GO

Create a new class to handle all db calls and name it as BulkDataManagement.cs. Add the following code to insert one by one row.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data.SqlClient;
namespace BulkDataManagement
{
    public class BulkDataManagement
    {
        public BulkDataManagement()
        {
            //strConnectionString=@"Data Source=WLW7-SRINIVASUP\PEMMA;
            //Initial Catalog=Pagination;User ID=sa;Password=srijailu123*";
        }
        public static void AddCustomer(string strConnectionString, string firstname, 
          string lastname, string address1, string address2, 
          string city, string state, string country, string zipcode)
        {
            using (SqlConnection con = new SqlConnection(strConnectionString))
            {
                con.Open();
                SqlCommand cmd = new SqlCommand("Customers_Insert", con);
                cmd.CommandType = System.Data.CommandType.StoredProcedure;
                cmd.Parameters.Add(new SqlParameter("@firstname", firstname));
                cmd.Parameters.Add(new SqlParameter("@lastname", lastname));
                cmd.Parameters.Add(new SqlParameter("@address1", address1));
                cmd.Parameters.Add(new SqlParameter("@address2", address2));
                cmd.Parameters.Add(new SqlParameter("@city", city));
                cmd.Parameters.Add(new SqlParameter("@state", state));
                cmd.Parameters.Add(new SqlParameter("@country", country));
                cmd.Parameters.Add(new SqlParameter("@zipcode", zipcode));
                cmd.ExecuteNonQuery();
            }
        }
    }
}

Now prepare you program.cs file to input the values to the application as shown below.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Diagnostics;
namespace BulkDataManagement
{
    class Program
    {
        static string connectionString = @"Data Source=WLW7-SRINIVASUP\" + 
          @"PEMMA;Initial Catalog=BlogSamples;User ID=sa;Password=srijailu123*";
        static void Main(string[] args)
        {
            Console.WriteLine("Enter Number of Records to be Inserted: ");
            int noofrecords = int.Parse(Console.ReadLine());
            Stopwatch sw = new Stopwatch();
            sw.Start();
            for (int i = 0; i < noofrecords; i++)
                BulkDataManagement.AddCustomer(connectionString, "First Name - " + i,
                    "Last Name - " + i, "Address1 - " + i, 
                    "Address2 - " + i, "City - I" + i, 
                    "State - " + i, "Country - " + i, "Zip Code - " + i);
            sw.Stop();
            Console.WriteLine("Total elapsed time in milli seconds : " + 
              sw.ElapsedMilliseconds.ToString());            Console.ReadLine();
        }
    }
}

Now run the application and input your number, lets assume 10000 and measure the application performance how long it takes will be noted in the application. It gave me almost 13 seconds duration as in below figure.

Bulk Data Management with 10000 records

Fig. 10000 Records

Now run it for 100,000 records and measure the performance as in the previous step. And the results are showing 129.23 Seconds to insert those many records.

Bulk Data Management With 100000 records

Fig. 100000 records

Multiple Records With Single XML data

Now we will try with another approach where we can insert data by means of preparing an xml and sending that to database Stored Procedure.

The XML can be in the following format.

Bulk Data Management Customer XML Format

Fig. XML Format

Now modiy your code and add new stored procedure as per the below

Stored Procedure 

-- ================================================
-- Template generated from Template Explorer using:
-- Create Procedure (New Menu).SQL
-- 
-- Use the Specify Values for Template Parameters 
-- command (Ctrl-Shift-M) to fill in the parameter 
-- values below.
--
-- This block of comments will not be included in
-- the definition of the procedure.
-- ================================================
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:  <Author,,Name>
-- Create date: <Create Date,,>
-- Description: <Description,,>
-- =============================================
CREATE PROCEDURE Customer_Insert_XML 
 -- Add the parameters for the stored procedure here
 @customerxml varchar(max)
AS
BEGIN
 -- SET NOCOUNT ON added to prevent extra result sets from
 -- interfering with SELECT statements.
 SET NOCOUNT ON;
 DECLARE @idoc int
    EXEC sp_xml_preparedocument @idoc OUTPUT, @customerxml
 Insert into Customers(FirstName,LastName,Address1,Address2,City,State,Country,ZipCode) 
 select * from OPENXML (@idoc,'/Root/Customer') WITH
 (
  firstname varchar(50),
  lastname varchar(50),
  address1 varchar(50),
  address2 varchar(50),
  city varchar(50),
  state varchar(50),
  country varchar(50),
  zipcode varchar(50)
 )
END
GO

BulkDataManagement.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data.SqlClient;
namespace BulkDataManagement
{
    public class BulkDataManagement
    {
        public BulkDataManagement()
        {
            //strConnectionString=@"Data Source=WLW7-SRINIVASUP\PEMMA;Initial Catalog=Pagination;User ID=sa;Password=srijailu123*";
        }
        public static void AddCustomer(string strConnectionString, 
               string firstname, string lastname, string address1, 
               string address2, string city, string state, string country, string zipcode)
        {
            using (SqlConnection con = new SqlConnection(strConnectionString))
            {
                con.Open();
                SqlCommand cmd = new SqlCommand("Customers_Insert", con);
                cmd.CommandType = System.Data.CommandType.StoredProcedure;
                cmd.Parameters.Add(new SqlParameter("@firstname", firstname));
                cmd.Parameters.Add(new SqlParameter("@lastname", lastname));
                cmd.Parameters.Add(new SqlParameter("@address1", address1));
                cmd.Parameters.Add(new SqlParameter("@address2", address2));
                cmd.Parameters.Add(new SqlParameter("@city", city));
                cmd.Parameters.Add(new SqlParameter("@state", state));
                cmd.Parameters.Add(new SqlParameter("@country", country));
                cmd.Parameters.Add(new SqlParameter("@zipcode", zipcode));
                cmd.ExecuteNonQuery();
            }
        }
        public static void AddCustomerWithXML(string strConnectionString, string customerxml)
        {
            using (SqlConnection con = new SqlConnection(strConnectionString))
            {
                con.Open();
                SqlCommand cmd = new SqlCommand("Customer_Insert_XML", con);
                cmd.CommandType = System.Data.CommandType.StoredProcedure;
                SqlParameter param = new SqlParameter("@customerxml", System.Data.SqlDbType.VarChar);
                param.Value = customerxml;
                cmd.Parameters.Add(param);
                cmd.ExecuteNonQuery();
            }
        }
    }
}

Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace BulkDataManagement
{
    class Program
    {
        static string connectionString = 
          @"Data Source=WLW7-SRINIVASUP\PEMMA;Initial Catalog" + 
          @"=BlogSamples;User ID=sa;Password=srijailu123*";
        static void Main(string[] args)
        {
            Console.WriteLine("Enter Number of Records to be Inserted: ");
            int noofrecords = int.Parse(Console.ReadLine());
            Console.WriteLine("Enter Which Method to Execute /nNormal Insert->1/nXML Insert->2/nSqlBulkCopy->3");
            int choice = int.Parse(Console.ReadLine());
            Stopwatch sw = new Stopwatch();
            sw.Start();
            if (choice == 1)
                Customer_Insert(noofrecords);
            else if (choice == 2)
                Customer_XML_Insert(noofrecords);
           sw.Stop();
            Console.WriteLine("Total elapsed time in milli seconds : " + sw.ElapsedMilliseconds.ToString());
            Console.ReadLine();
        }
        static void Customer_Insert(int num)
        {
            for (int i = 0; i < num; i++)
                BulkDataManagement.AddCustomer(connectionString, "First Name - " + i,
                    "Last Name - " + i, "Address1 - " + i, 
                    "Address2 - " + i, "City - I" + i, 
                    "State - " + i, "Country - " + i, "Zip Code - " + i);
        }
        static void Customer_XML_Insert(int num)
        {
            StringBuilder sb = new StringBuilder("<Root>");
            for (int i = 0; i < num; i++)
            {
                sb.Append("<Customer firstname='First Name - XML -" + i + "' ");
                sb.Append("lastname='Last Name -  XML -" + i + "' ");
                sb.Append("address1='Address1 -  XML -" + i + "' ");
                sb.Append("address2='Address2 -  XML -" + i + "' ");
                sb.Append("city='City -  XML -" + i + "' ");
                sb.Append("state='State -  XML -" + i + "' ");
                sb.Append("country='Country -  XML -" + i + "' ");
                sb.Append("zipcode='Zipcode -  XML -" + i + "' />");
            }
            sb.Append("</Root>");
            BulkDataManagement.AddCustomerWithXML(connectionString, sb.ToString());
        }
        static void Customer_Insert_With_SqlBulkCopy(int num)
        {
        }
    }
}

Here I added the choice section so that while running you can choose appropriate and note down the timings

1-> Normal Insert

2-> XML Insert

3-> Using SqlBulkCopy

Now execute the application and note down the results for XML Insert as we did in the previous section. You check the results as fast as it inserted the rows into the database table.

Bulk Data Management with XML for 10000 Records

Fig. XML 10000 Records

Now execute the same for 100,000 records. Check the application performance how fast it is and how cute the application. 100000 records also in 15 seconds amazing performance.

Bulk Data Management With XML 100000 Records

Fig. XML 100000 records

What an amazing performance with XML.

Bulk Data with SqlBulkCopy Class

Now update your BulkDataManagement.cs code as below. here we don't need any stored procedure. We can execute the commands directly.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data.SqlClient;
using System.Data;
namespace BulkDataManagement
{
    public class BulkDataManagement
    {
        public BulkDataManagement()
        {
            //strConnectionString=@"Data Source=WLW7-SRINIVASUP\PEMMA;Initial Catalog=Pagination;User ID=sa;Password=srijailu123*";
        }
        public static void AddCustomer(string strConnectionString, 
          string firstname, string lastname, string address1, string address2, 
          string city, string state, string country, string zipcode)
        {
            using (SqlConnection con = new SqlConnection(strConnectionString))
            {
                con.Open();
                SqlCommand cmd = new SqlCommand("Customers_Insert", con);
                cmd.CommandType = System.Data.CommandType.StoredProcedure;
                cmd.Parameters.Add(new SqlParameter("@firstname", firstname));
                cmd.Parameters.Add(new SqlParameter("@lastname", lastname));
                cmd.Parameters.Add(new SqlParameter("@address1", address1));
                cmd.Parameters.Add(new SqlParameter("@address2", address2));
                cmd.Parameters.Add(new SqlParameter("@city", city));
                cmd.Parameters.Add(new SqlParameter("@state", state));
                cmd.Parameters.Add(new SqlParameter("@country", country));
                cmd.Parameters.Add(new SqlParameter("@zipcode", zipcode));
                cmd.ExecuteNonQuery();
            }
        }
        public static void AddCustomerWithXML(string strConnectionString, string customerxml)
        {
            using (SqlConnection con = new SqlConnection(strConnectionString))
            {
                con.Open();
                SqlCommand cmd = new SqlCommand("Customer_Insert_XML", con);
                cmd.CommandType = System.Data.CommandType.StoredProcedure;
                SqlParameter param = new SqlParameter("@customerxml", System.Data.SqlDbType.VarChar);
                param.Value = customerxml;
                cmd.Parameters.Add(param);
                cmd.ExecuteNonQuery();
            }
        }
        public static void CustomersBulkCopy(string connectionstring, DataTable dtCustomers)
        {
            using (SqlBulkCopy scopy = new SqlBulkCopy(connectionstring, SqlBulkCopyOptions.TableLock))
            {
                scopy.DestinationTableName = "dbo.Customers";
                scopy.WriteToServer(dtCustomers);
            }
        }
    }
}

And change your program.cs file code as below.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data;
namespace BulkDataManagement
{
    class Program
    {
        static string connectionString = 
           @"Data Source=WLW7-SRINIVASUP\PEMMA;Initial Catalog=BlogSamples;User ID=sa;Password=srijailu123*";
        static void Main(string[] args)
        {
            Console.WriteLine("Enter Number of Records to be Inserted: ");
            int noofrecords = int.Parse(Console.ReadLine());
            Console.WriteLine("Enter Which Method to Execute /nNormal Insert->1/nXML Insert->2/nSqlBulkCopy->3");
            int choice = int.Parse(Console.ReadLine());
            Stopwatch sw = new Stopwatch();
            sw.Start();
            if (choice == 1)
                Customer_Insert(noofrecords);
            else if (choice == 2)
                Customer_XML_Insert(noofrecords);
            else if (choice == 3)
                Customer_Insert_With_SqlBulkCopy(noofrecords);
            sw.Stop();
            Console.WriteLine("Total elapsed time in milli seconds : " + sw.ElapsedMilliseconds.ToString());
            Console.ReadLine();
        }
        static void Customer_Insert(int num)
        {
            for (int i = 0; i < num; i++)
                BulkDataManagement.AddCustomer(connectionString, "First Name - " + i,
                    "Last Name - " + i, "Address1 - " + i, "Address2 - " + i, 
                    "City - I" + i, "State - " + i, "Country - " + i, "Zip Code - " + i);
        }
        static void Customer_XML_Insert(int num)
        {
            StringBuilder sb = new StringBuilder("<Root>");
            for (int i = 0; i < num; i++)
            {
                sb.Append("<Customer firstname='First Name - XML -" + i + "' ");
                sb.Append("lastname='Last Name -  XML -" + i + "' ");
                sb.Append("address1='Address1 -  XML -" + i + "' ");
                sb.Append("address2='Address2 -  XML -" + i + "' ");
                sb.Append("city='City -  XML -" + i + "' ");
                sb.Append("state='State -  XML -" + i + "' ");
                sb.Append("country='Country -  XML -" + i + "' ");
                sb.Append("zipcode='Zipcode -  XML -" + i + "' />");
            }
            sb.Append("</Root>");
            BulkDataManagement.AddCustomerWithXML(connectionString, sb.ToString());
        }
        static void Customer_Insert_With_SqlBulkCopy(int num)
        {
            DataTable dtCustomers = new DataTable("Customers");
            DataColumn dc = new DataColumn("FirstName");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            dc = new DataColumn("LastName");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            dc = new DataColumn("Address1");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            dc = new DataColumn("Address2");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            dc = new DataColumn("City");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            dc = new DataColumn("State");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            dc = new DataColumn("Country");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            dc = new DataColumn("ZipCode");
            dc.DataType = System.Type.GetType("System.String");
            dtCustomers.Columns.Add(dc);
            for (int i = 0; i < num; i++)
            {
                DataRow dr = dtCustomers.NewRow();
                dr["FirstName"] = "First Name sc - " + i;
                dr["LastName"] = "Last Name sc - " + i;
                dr["Address1"] = "Address1 sc - " + i;
                dr["Address2"] = "Address2 sc - " + i;
                dr["City"] = "City sc - " + i;
                dr["State"] = "State sc - " + i;
                dr["Country"] = "Country sc - " + i;
                dr["ZipCode"] = "ZipCode sc - " + i;
                dtCustomers.Rows.Add(dr);
            }
            BulkDataManagement.CustomersBulkCopy(connectionString, dtCustomers);
        }
    }
}

Now as we did in last 2 sections execute the application for sqlbulkcopy mode and check the application performance.

Wow its really amazing and the data 10000 records inserted time seems to be pretty fine and we achieved the best among these three approaches.

Bulk Data Management with SqlBulkCopy 10000 records

Fig. SqlBulkCopy 10000 Records

Check with 100,000 records now. The application performance is pretty amazing and the most efficient solution we have built now. See the performance which can insert 100000 records just in 2 seconds. This is what we are trying to prove till now

 

Conclusion:

     As you can see from the above experiment all the three method did well though SqlBulkCopy is the best among them in dealing with huge datasets. Below is the simple comparison from my experiments.

Comparison for Normal, XML and SqlBulkCopy bulk data management 

 

Points of Interest  

Here we can learn how to insert bulk data into database with the best optimized performance. The ways of handling bulk data in different scenarios.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here