Introduction
This article shows how to override the Schema.ini file on the fly to facilitate bulk uploads, and how to use the SQLXML feature to bulk upload data into a SQL Server 2000 table.
Background
Ever had issues in your web application with Excel bulk uploads where the Jet engine was not treating the column values as text? In this article we will see how to overcome this issue, along with sample code for uploading bulk data into a SQL Server 2000 table using the SQLXML features (though there are a hundred different ways to do the same thing!).
Using the code
I get a huge file once every two weeks which I need to upload to a SQL Server table. The columns have values with leading zeros. The file, with sample data, looks something like this:
// sample Provider_file.txt
provider ID,License
0000018, 079376
0000019, 076185
0000021, 034733
0000024, 032921
0000025, 034551
0000043, 079331
0000107, 035911
0000272, 075555
0000295, 074591
0000331, 057518
0000342, 074673
0000371, 079491
Now, I could make use of the OPENROWSET functionality or DTS to upload this file to the target table. But I have a little file upload interface in my web application that I use to upload the file. This also enables my clients to go online and upload data whenever they wish. What I do in the code is save the file with a date/time stamp on it, to make sure that I do not overwrite the previous file.
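The article does not show the timestamping step itself, so here is a minimal sketch. The variable `uploadDir` and the use of the posted file's `SaveAs` are assumptions about the surrounding page code; the date format matches the sample file name used later (PROVIDER_FILE_02232007090418.txt).

```vb
' Requires: Imports System.IO
' Build a unique, timestamped file name so the previous upload
' is never overwritten. "uploadDir" (the save folder) is assumed here.
Dim stamp As String = DateTime.Now.ToString("MMddyyyyHHmmss")
Dim uploadFileName As String = "PROVIDER_FILE_" & stamp & ".txt"
Dim fullPath As String = Path.Combine(uploadDir, uploadFileName)

' Save the posted file from the HtmlInputFile control shown below.
txtFileToUpload.PostedFile.SaveAs(fullPath)
```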
<label>UploadFile</label>
<INPUT id="txtFileToUpload" type="file" name="txtFileToUpload" runat="server" />
<INPUT id="btnUpLoad" onclick="doProcess();" type="button" value="Process"
    name="btnUpLoad" runat="server" />
Then read the file contents into a dataset:
Dim cnn As OleDb.OleDbConnection
Dim dsCSV As New DataSet("Providers")

' filePath is the folder containing the uploaded file;
' the file itself is named in the SELECT statement.
cnn = New OleDb.OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & filePath & ";Extended Properties=Text")

Dim mySelect As String = "Select * from [" & fileName & "]"
Dim da As New OleDb.OleDbDataAdapter(mySelect, cnn)
da.Fill(dsCSV)
Now, while selecting the data into a dataset, I observed that the column holding data with leading zeros was not treated as a text column, despite changing the column type to Text in Excel and then converting the file to CSV. This means that when the data is read into a dataset, the leading zeros vanish. Then I figured out that I needed to put a Schema.ini file, listing the column data types, into the folder containing the data file, so that the Jet engine knows the data types of the columns explicitly. Otherwise, the Jet engine interprets a column as numeric based on the first few rows of data.
Schema.ini override
[PROVIDER_FILE_02232007090418.txt]
ColNameHeader = True
Format = CSVDelimited
CharacterSet=ANSI
Col1=IfProv Char Width 20
Col2=License Char Width 20
The first line of the Schema.ini file contains the target file name, including the file extension, to which the schema applies. The second line states whether column headers are present in the target file. The third line contains the format of the target file, and the fourth line the character set. After the fourth line, you can start declaring the column names and their data types. Note that the Schema.ini file must be present in the same folder as the target data file.
In my application, since the upload happens biweekly, and since I needed the previously uploaded file to be left untouched, I had to come up with a way to change the file name (the first line of the schema file) in Schema.ini. So I decided to create the Schema.ini file on the fly every time an upload happens, feeding in the first line. The code for that looks something like this:
' FileMode.Create creates the file if it is missing and overwrites it
' otherwise (FileMode.Truncate fails when the file does not yet exist).
fileStream = New FileStream(fileName, FileMode.Create)
writer = New StreamWriter(fileStream)
writer.WriteLine("[" & UploadfileName & "]")
writer.WriteLine("ColNameHeader = True")
writer.WriteLine("Format = CSVDelimited")
writer.WriteLine("CharacterSet=ANSI")
writer.WriteLine("Col1=IfProv Char Width 20")
writer.WriteLine("Col2=License Char Width 20")
writer.Close()
Once the data is read into the dataset, rename the table and the column names (if need be) before getting the XML output.
If Not dsCSV Is Nothing Then
dsCSV.Tables(0).TableName = "Provider"
dsCSV.Tables(0).Columns(0).ColumnName = "IfProv"
dsCSV.Tables(0).Columns(1).ColumnName = "License"
End If
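The XML itself comes straight from the dataset; a one-line sketch, using the variable names from the snippets above:

```vb
' Serialize the "Providers" dataset, with its renamed table and
' columns, to an XML string. This is what gets passed to the
' stored procedure as @PROVDATA.
Dim xmlProviderData As String = dsCSV.GetXml()
```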
After renaming the table and columns, the XML data looks something like this:
<Providers>
  <Provider>
    <IfProv>0000018</IfProv>
    <License>079376</License>
  </Provider>
  <Provider>
    <IfProv>0000019</IfProv>
    <License>076185</License>
  </Provider>
</Providers>
Send the XML data to the stored procedure
Pass the XML on to the stored procedure for processing:
SqlHelper.ExecuteNonQuery(ConnectionString, _
    CommandType.StoredProcedure, _
    "BULK_UPLOAD_PROVIDER_DATA", _
    New SqlParameter() { _
        New SqlParameter("@PROVDATA", xmlProviderData), _
        returnParameter _
    })
Uploading data using SQLXML
Create a table with two columns as shown below.
CREATE TABLE TBL_PROVIDER (
    IfProv  VARCHAR(20) NULL,
    License VARCHAR(20) NULL
)
The stored procedure to upload the XML data into the table using the SQLXML feature would be something like this:
CREATE PROC BULK_UPLOAD_PROVIDER_DATA
    @PROVDATA NTEXT
AS
    DECLARE @IDOC INT
    EXEC SP_XML_PREPAREDOCUMENT @IDOC OUTPUT, @PROVDATA

    INSERT INTO TBL_PROVIDER (IfProv, License)
    SELECT * FROM OPENXML (@IDOC, '/Providers/Provider')
        WITH (IfProv  VARCHAR(20) 'IfProv',
              License VARCHAR(20) 'License')
    WHERE IfProv IS NOT NULL AND License IS NOT NULL

    EXEC SP_XML_REMOVEDOCUMENT @IDOC
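To sanity-check the procedure outside the application, you can call it directly from Query Analyzer with a small hand-written XML fragment. This is just a sketch; the two sample rows are taken from the data file above.

```sql
-- Quick test: insert two rows from an inline XML document
EXEC BULK_UPLOAD_PROVIDER_DATA
    @PROVDATA = N'<Providers>
        <Provider><IfProv>0000018</IfProv><License>079376</License></Provider>
        <Provider><IfProv>0000019</IfProv><License>076185</License></Provider>
    </Providers>'

-- Verify that the leading zeros survived the round trip
SELECT * FROM TBL_PROVIDER
```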
Wrapping it all up
I have always enjoyed using the SQLXML features of SQL Server, and I hope that the Schema.ini override, along with the SQLXML features shown in this article, helps resolve people's day-to-day bulk upload issues.
Disclaimer: The source code contains Microsoft's SqlHelper class for making database calls, which I use in most of my applications (I'm sure many readers use it too!); it is not written by me. I included it in the source code zip file just to make sure that the code works after downloading it.