Using OleDb to Import Text Files (tab, CSV, custom)

15 Jul 2008
A simple class to help you get started with the OleDb Jet Engine to import text files

Introduction

I had been browsing the Web for a good, simple class to handle delimited file imports. My current assignment has an import option that has to deal with such files, but the existing implementation (using StreamReader) is not good enough: it doesn't handle all the exceptional cases you encounter in delimited files. I found a number of examples on the Internet, but none of them really suited my needs. What I really missed was a simple example that I could extend. So, being the developer that I am, I created my own class to import delimited files. Once it was complete, I thought I'd share it with others as an example.

Using StreamReader

The easiest way to process delimited files is to use a StreamReader object: open the file, read each line, and use the Split method to get the column values. For example:

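using System;
using System.IO;

public void ImportDelimitedFile(string filename, string delimiter)
{
    using (StreamReader file = new StreamReader(filename))
    {
        string line;

        // ReadLine returns null at the end of the file
        while ((line = file.ReadLine()) != null)
        {
            // Skip blank lines
            if (line.Trim().Length > 0)
            {
                // Passing the delimiter as a string[] works on every .NET version
                string[] columns = line.Split(new string[] { delimiter }, StringSplitOptions.None);

                // Add code to process the columns
            }
        }
    }
}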

In many cases this works just fine, but the approach has limitations:

  • It's difficult to split a line into columns reliably. For example, in a comma-separated values (CSV) file, a column value may itself contain a comma. A simple string.Split is therefore not an option.
  • When you only need certain columns or lines, you still have to read every line and filter out what you need yourself.
  • It's not possible to return to a previous line.
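
The first limitation is easy to reproduce. The following is a minimal sketch, not part of the original article; the class name NaiveSplitDemo and the sample line are made up for illustration. It splits a line whose second column is quoted and contains a comma:

```csharp
using System;

public static class NaiveSplitDemo
{
    // Splits a line the same way the StreamReader example does.
    public static string[] SplitLine(string line, string delimiter)
    {
        return line.Split(new string[] { delimiter }, StringSplitOptions.None);
    }

    // Splits a sample line with three logical columns:
    //   1   "Smith, John"   Amsterdam
    // The quoted value contains the delimiter, so Split cuts it in two.
    public static string[] SplitSample()
    {
        string line = "1,\"Smith, John\",Amsterdam";
        return SplitLine(line, ",");
    }
}
```

SplitSample returns four elements instead of three, because the quoted name is broken at its embedded comma and the quote characters stay attached to the two halves.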

Using the Jet Engine

The Jet engine addresses the problems mentioned above. The following code shows how a CSV file can be processed:

using System.Data;
using System.Data.OleDb;
using System.IO;

public void ImportCsvFile(string filename)
{
    FileInfo file = new FileInfo(filename);

    // Data Source is the folder; the file itself acts as the table name
    string connectionString =
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" + file.DirectoryName + "\";" +
        "Extended Properties='text;HDR=Yes;FMT=Delimited(,)';";

    using (OleDbConnection con = new OleDbConnection(connectionString))
    {
        using (OleDbCommand cmd = new OleDbCommand(string.Format
                                  ("SELECT * FROM [{0}]", file.Name), con))
        {
            con.Open();

            // Option 1: using a DataReader to process the data
            using (OleDbDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Process the current reader entry...
                }
            }

            // Option 2: using a DataTable to process the data
            using (OleDbDataAdapter adp = new OleDbDataAdapter(cmd))
            {
                DataTable tbl = new DataTable("MyTable");
                adp.Fill(tbl);

                foreach (DataRow row in tbl.Rows)
                {
                    // Process the current row...
                }
            }
        }
    }
}

As the example shows, once you have the Command object, you can do anything a command object allows. You can process the file using a DataReader object, create a DataTable object containing the data, or even add a WHERE clause to the CommandText of your Command object to narrow down which data is imported.
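
To illustrate the last point, here is a small sketch of how the CommandText could be built when only some columns and rows are needed. The JetTextQuery class and its BuildSelect method are hypothetical helpers, not part of the article's download, and rejecting names that contain brackets is a simplification rather than a complete escaping scheme:

```csharp
using System;
using System.Collections.Generic;

public static class JetTextQuery
{
    // Wraps a file or column name in square brackets.
    // Names containing brackets are rejected rather than escaped (a simplification).
    private static string Bracket(string name)
    {
        if (string.IsNullOrEmpty(name) || name.IndexOfAny(new char[] { '[', ']' }) >= 0)
        {
            throw new ArgumentException("Unsupported name: " + name);
        }
        return "[" + name + "]";
    }

    // Builds "SELECT <columns> FROM [file] WHERE <filter>".
    // A null or empty column list selects everything;
    // a null or empty filter leaves out the WHERE clause.
    public static string BuildSelect(string fileName, IList<string> columns, string filter)
    {
        string columnList = "*";
        if (columns != null && columns.Count > 0)
        {
            List<string> quoted = new List<string>();
            foreach (string column in columns)
            {
                quoted.Add(Bracket(column));
            }
            columnList = string.Join(", ", quoted.ToArray());
        }

        string sql = "SELECT " + columnList + " FROM " + Bracket(fileName);
        if (!string.IsNullOrEmpty(filter))
        {
            sql += " WHERE " + filter;
        }
        return sql;
    }
}
```

For a hypothetical file customers.csv with Name and City columns, BuildSelect("customers.csv", new string[] { "Name", "City" }, "[City] = ?") produces SELECT [Name], [City] FROM [customers.csv] WHERE [City] = ?. The resulting text can be passed to the OleDbCommand constructor, with the value for the ? placeholder supplied through cmd.Parameters.AddWithValue. How well a particular filter works against a text file is worth testing on your own data.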

Helper Class

Using this, and the information provided in this Microsoft article, I created a small class that allows you to import delimited files. The class is very basic, but can easily be extended to suit your specific needs. It addresses the most important issues you run into when using Jet as your import engine.

Listed below are some things you need to consider when you are importing delimited files, be it with this class or using custom code:

  • The Jet engine makes assumptions about the content of the file, which can result in incorrect imports. For example, it might decide that a column contains date values when it should actually be treated as a string. In such cases, you should create a Schema.ini file that describes the type of each column. The class creates a Schema.ini file before it opens the delimited file, but only to specify the delimiter. You may want to change this to use predefined INI files that describe your input file. Microsoft's documentation covers the Schema.ini file in detail.
  • The class uses an OleDbDataReader to read each line in the import file, but this is easily replaced by loading the data into a DataSet or DataTable object. It's also possible to use SqlBulkCopy to insert all the data into a SQL Server database in one operation.
  • The above-mentioned Microsoft article is the best starting point for this type of import. You might want to read it before you start building imports for delimited files, even if this class is useful to you. The article provides interesting background information and links to various Microsoft resources with more details.
  • The helper class uses an event to allow you to handle the information being read. You can, of course, also provide an overridable method.
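
As an illustration of the Schema.ini point, the sketch below writes a section that sets the delimiter and also fixes the column types. It is not the code from the download: SchemaIniWriter, BuildSection and Write are hypothetical names, and the column definitions are assumed to be passed in as "Name Type" strings such as "OrderDate DateTime".

```csharp
using System;
using System.IO;
using System.Text;

public static class SchemaIniWriter
{
    // Maps a delimiter to the Format value used in Schema.ini.
    private static string FormatValue(char delimiter)
    {
        if (delimiter == ',') return "CSVDelimited";
        if (delimiter == '\t') return "TabDelimited";
        return "Delimited(" + delimiter + ")";
    }

    // Builds the Schema.ini section for one file.
    // columns holds "Name Type" entries, for example "OrderDate DateTime".
    public static string BuildSection(string fileName, char delimiter,
                                      bool hasHeader, string[] columns)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("[" + fileName + "]");
        sb.AppendLine("Format=" + FormatValue(delimiter));
        sb.AppendLine("ColNameHeader=" + (hasHeader ? "True" : "False"));
        for (int i = 0; i < columns.Length; i++)
        {
            // Column entries are numbered from 1: Col1, Col2, ...
            sb.AppendLine("Col" + (i + 1) + "=" + columns[i]);
        }
        return sb.ToString();
    }

    // Writes Schema.ini into the folder of the data file.
    // Note: this overwrites any Schema.ini already present in that folder.
    public static void Write(string dataFilePath, char delimiter,
                             bool hasHeader, string[] columns)
    {
        FileInfo file = new FileInfo(dataFilePath);
        string iniPath = Path.Combine(file.DirectoryName, "Schema.ini");
        File.WriteAllText(iniPath, BuildSection(file.Name, delimiter, hasHeader, columns));
    }
}
```

Declaring a column as Text in this way is what keeps Jet from guessing, for instance, that a column of codes resembling dates holds DateTime values. Because one Schema.ini serves a whole folder, a real implementation would probably merge sections instead of overwriting the file.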

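The event-based design mentioned in the last point could look roughly like the sketch below. It is an assumption about the general shape, not the actual helper class from the download: RowReadEventArgs and DelimitedRowPump are hypothetical names, and the class works against the IDataReader interface so that any reader, including the OleDbDataReader from the Jet example, can be passed in.

```csharp
using System;
using System.Data;

// Event data carrying the values of one imported row (hypothetical type).
public class RowReadEventArgs : EventArgs
{
    private readonly object[] values;

    public RowReadEventArgs(object[] values) { this.values = values; }

    public object[] Values { get { return values; } }
}

public class DelimitedRowPump
{
    // Raised once for every row the reader returns.
    public event EventHandler<RowReadEventArgs> RowRead;

    // Overridable alternative to subscribing to the event.
    protected virtual void OnRowRead(RowReadEventArgs e)
    {
        EventHandler<RowReadEventArgs> handler = RowRead;
        if (handler != null)
        {
            handler(this, e);
        }
    }

    // Walks the reader, hands each row to OnRowRead and returns the row count.
    public int Process(IDataReader reader)
    {
        int count = 0;
        while (reader.Read())
        {
            object[] values = new object[reader.FieldCount];
            reader.GetValues(values);
            OnRowRead(new RowReadEventArgs(values));
            count++;
        }
        return count;
    }
}
```

A caller subscribes to RowRead and then calls Process with the result of cmd.ExecuteReader(). A derived class can override OnRowRead instead, which corresponds to the overridable-method variant mentioned above.
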
Disclaimer

The code presented in the helper class is not an all-purpose import solution. It's just a basic class to help you build your own import class. If you need other import types, or a way to influence the content of the default Schema.ini file, you will need to do that yourself. If you find any problems, please feel free to point them out to me.

History

  • 15th July, 2008: Initial post

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.