I was recently in need of importing data from a CSV (Comma Separated File) to a table in SQL Server 2008.
After searching in Google for a little while, I found this blog entry from Pinal Dave (SQL Authority) which always provides good content.
Besides passing along the technique described in Pinal's blog post, I would like to expand by explaining a few snags that I ran into.
My source data in the CSV file looks something like this:
HOU009,Windows Phone 7,Will Martinez,
11/10/2011,Houston; TX,999,2
HOU010,WPF for Business Applications,Will Martinez,
11/15/2011,Houston; TX,695,1
More or less, I have all the data needed to fill the columns of my target table. My assumption is that my column id of type uniqueidentifier
is going to be auto-generated. This is my table:
CREATE TABLE [dbo].[Courses](
[id] [uniqueidentifier] NOT NULL DEFAULT NEWSEQUENTIALID(),
[code] [varchar](50) NOT NULL,
[description] [varchar](200) NULL,
[instructor] [varchar](50) NULL,
[date] [date] NULL,
[venue] [varchar](50) NULL,
[price] [money] NULL,
[duration] [int] NULL,
CONSTRAINT [PK_Courses]
PRIMARY KEY CLUSTERED ([id] ASC,[code] ASC )
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, _
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]) ON [PRIMARY]
As you may notice, I have two datatypes that may be challenging, first my primary key is of type uniqueidentifier
and also, I'm using a date datatype
.
Initially, I tried to import my data from the CSV file using the below statement:
bulk insert [dbo].[Courses]
from 'C:\Courses.csv'
with (fieldterminator = ',', rowterminator = '\n')
go
On my first try, I did not have any luck. I ran into the below issue:
Msg 4861, Level 16, State 1, Line 1
Cannot bulk load because the file "C:\Courses.csv"
could not be opened. Operating system error code 5
(Access is denied.).
Moved my file to the Public directory and then made sure permissions were set to 'Everyone
'. No luck, although I did not get the "access denied" error. My insert returns 0 row(s) affected.
At this point, my guess was that I had a couple of issues:
- First, I did not have a value for the Id column which is our primary key
- My column of type
date
could be a problem if there is an implicit conversion of string
data
To work around the problem, I decided to create a new temporary table, without a primary key and with a datetime datafield
. Here is the table:
CREATE TABLE [dbo].[CoursesTemp](
[code] [varchar](50) NOT NULL,
[description] [varchar](200) NULL,
[instructor] [varchar](50) NULL,
[date] [datetime] NULL,
[venue] [varchar](50) NULL,
[price] [money] NULL,
[duration] [int] NULL)
Ran my insert
statement again:
bulk insert [dbo].[CoursesTemp]
from 'C:\Users\Public\Downloads\Courses.csv'
with (fieldterminator = ',', rowterminator = '\n')
go
Success! Finally got to see what I was waiting for...(20 row(s) affected).
At this point, life is easy. I can use the data I just inserted in the temporary table and use it to insert into the "live
" table.
insert [dbo].[Courses]
(code, description, instructor, date, venue, duration)
select
code, description, instructor, cast(date as date), venue,
duration
from [dbo].[CoursesTemp]
Notice that my Id column is not listed since it has a uniqueidentifier
with a default of NEWSEQUENTIALID , it automatically generates a GUID
for each record.
To deal with the issue of the date field, I noticed that there is a CAST statement that will convert the data to the appropriate datatype
.
Below is how the data looks like in my table:
?
Now, I have the data I needed and my id column has nice GUID
s generated for every record.
I'm ready to get some work done.
Hope this helps somebody out there,
Will