Introduction
This is a utility I use (converted from VB) to move data to and from CSV files and data tables. Hopefully, this will answer all those questions about moving CSV data to and from data tables.
Background
A lot of my work involves moving large amounts of data from legacy and core data systems into analysis and reporting systems. This is just one of the utilities I have developed over the years to meet those requirements.
This version does not support text delimiters (commas embedded in the data will break the split method).
I have kept clsFileHandler as a separate project as we include the file handler DLL in our projects. I assume that it will be included in a utilities project.
The sample CSV file is from the AdventureWorks contact table.
Using the code
GetDetails
This instantiates clsFileHandler using the file path from txtFileName. The constructor creates a new IO.FileInfo object and exposes it in the FileInf property. You can also pass in a populated FileInfo object. I also added the following properties as they are not in the FileInfo object for some reason.
UNCPath
If the drive in the file name is a network drive, the UNC path is exposed; if it is a local drive the UNC path is blank.
NameOnly
This exposes the file name without the extension. As I map file names to the data table when using BCP or BulkCopy and the table does not have an extension, this is a useful property.
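As a rough illustration of what NameOnly exposes, the sketch below strips the extension with the framework's Path.GetFileNameWithoutExtension. The class and method names are hypothetical and are not taken from clsFileHandler; the real property may be implemented differently.

```csharp
using System;
using System.IO;

public static class FileNameSketch
{
    // Hypothetical helper (not part of clsFileHandler):
    // returns the file name with its extension removed.
    public static string NameOnly(FileInfo file)
    {
        return Path.GetFileNameWithoutExtension(file.Name);
    }
}
```

For a file called Contact.csv this returns Contact, which can then be used as the target table name.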
CSVToTable
- TitleRowNo: This indicates whether there is a title row to use for the table column headers. If there is no title, leave the setting at -1.
- FirstDataRow: This setting caters for blank lines at the beginning of the file. It also has the flexibility to support blank lines between the title row and the data.
- FieldDelimiter: Defaults to a comma.
- MaxRows: Defaults to 0, which reads all rows; setting this to a positive number limits the number of rows read. This is useful when test reading a large file.
GetData
This calls the method CSVToTable. The problem here is remembering to set the data row and header row properties, because if these are incorrectly left at the defaults, they can really screw up your data.
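{
    // Copy the settings from the form controls to the file handler before reading.
    oFH.Delimiter = txtDelimiter.Text;
    oFH.DataRow1 = Convert.ToInt32(numDataRow.Value);   // first row that holds data
    oFH.HeaderRow = Convert.ToInt32(numTitleRow.Value); // row that holds the column titles
    oFH.MaxRows = (int)numMax.Value;                    // 0 reads all rows
    // Read the file and bind the resulting table to the grid.
    dtData = oFH.CSVToTable();
    DataGridView1.DataSource = dtData;
}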
Results where the title and first data row are set incorrectly:
After the correct settings are applied:
In previous versions, the settings were passed in as parameters, and I suspect that may still be the better solution.
CSVtoTable
This method does the following:
- Checks the settings and parameters.
- Reads the title row, or the first data row if there is no title.
- Cleanses the text line of single quotes and slashes.
- Uses the first row to create a table and name the columns based on the title or "ColName01" etc.; duplicate or blank named columns are dealt with in the same manner.
- Reads each row and loads it into the data table.
The method returns a populated data table.
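The steps above can be sketched roughly as follows. This is not the code from clsFileHandler, just a minimal illustration under several assumptions: the class name, method name, and parameters are hypothetical, row numbers are treated as zero-based line indexes, "slashes" is taken to mean both forward and back slashes, and blank lines in the data are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

public static class CsvToTableSketch
{
    // Assumption: titleRow and firstDataRow are zero-based line indexes;
    // titleRow = -1 means the file has no title row.
    public static DataTable Load(TextReader reader, int titleRow, int firstDataRow,
                                 char delimiter = ',', int maxRows = 0)
    {
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null) lines.Add(line);

        var table = new DataTable();
        if (firstDataRow < 0 || firstDataRow >= lines.Count) return table;

        // Build the columns from the title row, or from the first data row
        // (for the column count only) when there is no title.
        bool hasTitle = titleRow >= 0 && titleRow < lines.Count;
        string[] first = Clean(lines[hasTitle ? titleRow : firstDataRow]).Split(delimiter);
        for (int i = 0; i < first.Length; i++)
        {
            string name = hasTitle ? first[i].Trim() : "";
            // Blank and duplicate titles fall back to ColName01, ColName02, ...
            if (name.Length == 0 || table.Columns.Contains(name))
                name = "ColName" + (i + 1).ToString("00");
            // Guard against a generated name clashing with a real title.
            while (table.Columns.Contains(name)) name += "_";
            table.Columns.Add(name, typeof(string));
        }

        int loaded = 0;
        for (int r = firstDataRow; r < lines.Count; r++)
        {
            if (maxRows > 0 && loaded >= maxRows) break;  // 0 means read everything
            if (lines[r].Trim().Length == 0) continue;    // assumption: skip blank lines
            string[] fields = Clean(lines[r]).Split(delimiter);
            DataRow row = table.NewRow();
            // Extra fields beyond the column count are ignored in this sketch.
            for (int c = 0; c < table.Columns.Count && c < fields.Length; c++)
                row[c] = fields[c];
            table.Rows.Add(row);
            loaded++;
        }
        return table;
    }

    // Assumption: "slashes" covers both '/' and '\'.
    private static string Clean(string s)
    {
        return s.Replace("'", "").Replace("/", "").Replace("\\", "");
    }
}
```

Like the utility itself, this sketch relies on a plain Split, so a delimiter embedded in a field will still shift the columns.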
TableToCSV
This method simply uses a stream writer to output the data to the file name specified. The user can set the destination, the delimiter to use, and whether to include titles.
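A minimal sketch of that idea is shown below. The class, method names, and parameters are hypothetical rather than the actual clsFileHandler signature; the core method writes to any TextWriter, and a thin wrapper opens a StreamWriter on the destination file.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;

public static class TableToCsvSketch
{
    // Hypothetical helper: writes the table to any TextWriter.
    public static void Write(DataTable table, TextWriter writer,
                             string delimiter = ",", bool includeTitles = true)
    {
        if (includeTitles)
        {
            var names = table.Columns.Cast<DataColumn>().Select(c => c.ColumnName);
            writer.WriteLine(string.Join(delimiter, names));
        }
        foreach (DataRow row in table.Rows)
        {
            // DBNull values are written as empty strings.
            var values = row.ItemArray.Select(v => Convert.ToString(v));
            writer.WriteLine(string.Join(delimiter, values));
        }
    }

    // Hypothetical wrapper: opens a StreamWriter on the destination path.
    public static void WriteFile(DataTable table, string path,
                                 string delimiter = ",", bool includeTitles = true)
    {
        using (var writer = new StreamWriter(path))
        {
            Write(table, writer, delimiter, includeTitles);
        }
    }
}
```

Note that this sketch does no quoting, so a value that contains the delimiter will produce an extra column when the file is read back.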
Points of interest
This utility needs to be extended to deal with text delimiters by reading each character in each line to determine the content. Another enhancement will be to export a list view to a CSV file.
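One way the character-by-character approach might look is sketched below. It is an illustration only, not part of the utility: the class and method names are hypothetical, the text delimiter is assumed to be a double quote, and a doubled quote inside a quoted field is assumed to mean one literal quote. Quoted fields that span several lines are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedSplitSketch
{
    // Hypothetical replacement for String.Split that respects a text delimiter.
    public static List<string> SplitLine(string line, char delimiter = ',', char qualifier = '"')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char ch = line[i];
            if (inQuotes)
            {
                if (ch == qualifier)
                {
                    // A doubled qualifier inside a quoted field is one literal qualifier.
                    if (i + 1 < line.Length && line[i + 1] == qualifier)
                    {
                        current.Append(qualifier);
                        i++;
                    }
                    else
                    {
                        inQuotes = false; // closing qualifier
                    }
                }
                else
                {
                    current.Append(ch); // delimiters inside quotes are plain data
                }
            }
            else if (ch == qualifier)
            {
                inQuotes = true; // opening qualifier
            }
            else if (ch == delimiter)
            {
                fields.Add(current.ToString()); // end of field
                current.Clear();
            }
            else
            {
                current.Append(ch);
            }
        }
        fields.Add(current.ToString()); // last field on the line
        return fields;
    }
}
```

With this in place, a line such as 1,"Smith, John",x splits into three fields instead of four.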
History
First draft.