Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

SQLXAgent - Jobs for SQL Express - Part 3 of 6

5.00/5 (2 votes)
30 Sep 2017CPOL10 min read 8.8K   29  
Create and run jobs kinda like SQL Server Enterprise - The CSV and Excel Importer Code

Part 1 - Using SQLXAgent
Part 2 - Architecture and Design Decisions
Part 3 - The CSV and Excel Importer Code (this article)
Part 4 - Job Scheduling Code
Part 5 - How Packages Are Run
Part 6 - Interesting Coding

Introduction

This article is the third in a larger series of articles describing the SQLXAgent utility, specifically discussing the code that handles importing of Excel and CSV files. There won't be very many screen shots - just a bunch of code snippets and boring narrative. If you don't care about the subject matter of this article part, feel free to ignore it and go find something else to do. Other coding challenges are discussed in part 4 of this article series.

Note: - The only differences between the code in the files and the code in this article is related to the elimination (via line formatting) of line-length-induced horizontal scroll bars.

Throughout this code, you may notice calls to unfamiliar methods (for example, string.ReplaceNonAlphaNumeric()). I make heavy use of extension methods, and that's probably what you're seeing. If you're really curious, you can find them in the ObjectExtensions folder within the SQLXCommon project.

NOTE: Code snippets presented in the article may or may not reflect the absolutely latest and greatest version of the code. In the event that it does not exactly match the actual code, it will fairly close. Such is the nature of noticing issues while writing an article.

General Overview

The classes described here are located in the SQLXPkgBase assembly. The class inheritance and dependency hierarchy is illustrated below.

Importing class hierarchy

I wanted to bring up the fact that the CSV importer is loosely based on the article I wrote in 2016 - CSV File Parser . I say "loosely based" because writing this code forced me to re-evaluate the way I approached making the imported data accessible to the parent application/assembly.

The Code

In this article we'll be discussing the classes that are primarily involved in the importation of data. They are:

  • ImporterBase - The absolute (abstract) base class for CSV and Excel importer classes.
     
  • CSVImportBase - The (abstract) base class for CSV importer classes
     
  • ExcelImportBase - The (abstract) base class for Excel importer classes
     
  • HintItem and HintItemList - describes imported columns
     

ImporterBase

This is an abstract class that contains most of the properties and functionality for every CSV or Excel importer. Most of the methods are protected so they can be used by the inheriting classes. The inheriting classes are responsible for the actual loading of the raw data, and creating a DataTable object on which this class performs its processing. Once the desired file is loaded, the inheriting class calls the ProcessData() method.

C#
/// <summary>
/// Calls all of the methods involved in processing data loaded from the file.
/// All of the methods called here operate on a DataTable.
///
</summary>
protected virtual void ProcessData()
{
    // remove empty rows (if configured to do so)
    this.RemoveEmptyRows();
    // remove linefeeds and spaces from column names
    this.NormalizeNames();
    // Create column hints list if necessary. If you want to use your own,
    // instantiate it in your derived class BEFORE calling the ProcessData()
    // method.
    this.CreateColumnHints();
    // determine the max width of each column
    this.EstablishStringLengths();
    // builds a default database prep query (which can be overridden from the
    // derived class)
    this.BuildPrepQuery();
    // build the query that creates the table (which can be overridden from the
    // derived class)
    this.BuildCreateTableQuery();
    // build the query the inserts the data (which can be overridden from the
    // derived class)
    this.BuildInsertQuery();
}

The methods are described below. I retained the comments in an (probably futile) attempt to keep the narrative text to a minimum.

  • RemoveEmptyRows() - If the DeleteEmptyRows property is true, any row in the data table object that have all empty/null fields is removed from the data table.
    C#
    /// <summary>
    /// If DeleteEmptyRows property is true, removes empty rows (all columns = DBNull) 
    /// from the loaded excel data.
    /// </summary>
    protected virtual void RemoveEmptyRows()
    {
        this.RemovedRowCount = 0;
        if (this.DeleteEmptyRows)
        {
            for (int i = this.ImportedData.Rows.Count-1; i >= 0; i--)
            {
                DataRow row = this.ImportedData.Rows[i];
                if (row.ItemArray.All(x=>x is DBNull || string.IsNullOrEmpty(x.ToString())))
                {
                    this.ImportedData.Rows.Remove(row);
                    this.RemovedRowCount++;
                }
            }
        }
    }
  • NormalizeNames() - This method will normalize column names and the table name. If there is no header row in the source data file, columns will be automatically created using the "Column_n" format, where "n" is the ordinal position of the column within the source data. If the table name was not specifically specified by the programmer, the name of the table into which the data is imported is determined by the file name.
    C#
    /// <summary>
    /// If the first row is a header row, this method changes ExcelData.Columns to the 
    /// contents of the first row of the table, and the first row is deleted. The 
    /// column names (and table name) are massaged to replace all non-alphanumeric 
    /// characters with underscores, as well as trimming leading/trailing spaces.
    /// </summary>
    /// <param name="dt"></param>
    protected virtual void NormalizeNames()
    {
        // The ExcelDataReader object does not appear to have a property we can use to 
        // determine if the first row is a header, so we use our own property (set by 
        // the programmer).
        // If the first row is the header, pull the names from the first row, and 
        // delete the row.
        if (this.FirstRowIsHeader)
        {
            // first row
            DataRow headerRow = this.ImportedData.Rows[0];
            // processe each column
            for (int i = 0; i < headerRow.ItemArray.Count(); i++)
            {
                // to ease typing
                DataColumn excelColumn = this.ImportedData.Columns[i];
                string rowColumn = (string)headerRow.ItemArray[i];
                // Set the column name from the first row of the data
                                            // if the column in the row is null/empty
                excelColumn.ColumnName = (string.IsNullOrEmpty(rowColumn) 
                                            // keep the name we already have
                                            ? excelColumn.ColumnName 
                                            // otherwise set the excel column to whatever 
                                            // the row says it should be
                                            : rowColumn
                                            // trim leading/trailing spaces, and 
                                            // replace linefeeds and embedded spaces 
                                            // with underscores
                                            ).Trim().ReplaceNonAlphaNumeric('_');
            }
            // delete the header row
            this.ImportedData.Rows.RemoveAt(0);
        }
        // We use either the specified ImportedTableName, or we build the table name 
        // based on the filename and (if this is an excel file) sheet name. I try to 
        // avoid nested ternary conditions, but sometimes, it just makes sense to use 
        // them. In these situations, code formatting goes a long way toward assisting 
        // a programmer who is not familiar with the code base.
        string newTableName = (string.IsNullOrEmpty(this.ImportedTableName)) 
                                ? string.Concat("Imported_", 
                                                Path.GetFileName(this.FileName).Trim(), 
                                                ((this.FileType == ImportFileType.EXCEL) 
                                                 ? "_" 
                                                 : ""), 
                                                ((this.FileType == ImportFileType.EXCEL) 
                                                 ? this.SheetName.Trim() 
                                                 : "")).ReplaceNonAlphaNumeric('_')
                                : this.ImportedTableName;
        // prepend the word "Imported_" and set the excel table name 
        this.ImportedData.TableName = string.Format("Imported_{0}", newTableName);
    }
  • CreateColumnHints() - This method is responsible for auto-building a "hints" list, which is used by the SQL query code to create the import table with field names and appropriate data types. If the inheriting class has already created a hints list (ostensibly via an overridden version of this method), this method simply exits without doing anything. The column hint related classes are described in the next section.
    C#
    /// <summary>
    /// Instantiate the hint list with one hint item for every column specified in the 
    /// data table columns. Then, the hint item's type is determined by iterating all 
    /// values for a given row/column to find the most appropriate type.
    /// </summary>
    /// <param name="fromCSV">True if importing from CSV file.</param>
    protected virtual void CreateColumnHints(bool fromCSV=false)
    {
        // if the programmer hasn't already specified a hints list
        if (this.ColumnHints == null || this.ColumnHints.Count == 0)
        {
            // instantiate
            this.ColumnHints = new ItemHints();
            // for each column in the Columns collection
            for (int i = 0; i < this.ImportedData.Columns.Count; i++)
            {
                // get the column
                DataColumn col = this.ImportedData.Columns[i];
                // if the name isn't null/empty (theoretically impossible, but we 
                // check simply because it's a good idea)
                if (!string.IsNullOrEmpty(col.ColumnName))
                {
                    // create a new hint item
                    HintItem hint = new HintItem(){ Name = col.ColumnName, ItemType=null};
                    // iterate each row
                    foreach(DataRow row in this.ImportedData.Rows)
                    {
                        // try to determine the best data type based on all of the 
                        // possible values
                        hint.DetermineType(row[col], fromCSV);
                        // if we determine at any point that the column should be a 
                        // string, we can quit because a string type is our ultimate 
                        // fallback data type.
                        if (hint.ItemType.Name.IsLike("String"))
                        {
                            break;
                        }
                    }
                    // add the hint to our list
                    this.ColumnHints.Add(hint);
                }
            }
        }
    }
  • EstablishStringLengths() - This method tries to establish the max length of string type columns. It uses the VarcharLengthMultiple property to determine the max length. So, using the default multiple of 255 characters, and the max length of all of the values for a given column was 300 characters, the resulting max string size for the column would be 510 characters.
    C#
    /// <summary>
    /// Determines the max necessary string length for all columns in the column hints 
    /// list. Lengths will be multiples of the value of VarcharLengthMultiple. By 
    /// default, this value is 255.
    /// </summary>
    protected virtual void EstablishStringLengths()
    {
        // if we have a hint list
        if (this.ColumnHints != null && this.ColumnHints.Count > 0)
        {
            // for each hint in the list
            for (int i = 1; i < this.ImportedData.Columns.Count; i++)
            {
                // it the hint represents a string
                if (this.ImportedData.Columns[i].DataType.Name.IsLike("string"))
                {
                    // calculate the length of largest string in the data
                    int maxSize = 0;
                    foreach(DataRow row in this.ImportedData.Rows)
                    {
                        maxSize = Math.Max(maxSize, (row.ItemArray[i] is DBNull) 
                                                    ? 0 
                                                    : ((string)(row.ItemArray[i])).Length);
                    }
                    // Determine how many times we have to mulitply our varchar 
                    // multiple value 
                    long rem = 0;
                    int factor = (int)(Math.DivRem((long)maxSize, (long)this.VarcharLengthMultiple, out rem));
                    factor += ((rem > 0) ? 1 : 0);
                    // Find the appropriate hint (we could assume an ordinal position, 
                    // but that's just crazy talk, and besides, there's no point in 
                    // tempting fate)
                    HintItem hint = this.ColumnHints.FirstOrDefault
                                    (x=>x.Name.ToUpper() == this.ImportedData.Columns[i].ColumnName.ToUpper());
                    // if we found the appropriatre hint, set the size (this is used 
                    // to build the CreateQuery)
                    if (hint != null)
                    {
                        hint.VarcharSize = this.VarcharLengthMultiple * factor;
                    }
                }
            }
        }
    }
  • BuildPrepQuery() - This method constructs the database preparation query, or in other words, whether to drop the import table completely or to just truncate it.
    C#
    /// <summary>
    /// Builds the database prep query. An empty string indicates append and nothing is done 
    /// to the table before-hand. Otherwise the table is either dropped and recreated, 
    /// or truncated.
    /// </summary>
    protected virtual void BuildPrepQuery()
    {
        // we're going to prep the table based on the programmer-specified setting
        switch (this.PrepAction)
        {
            case TablePrepAction.Drop :
                {
                    this.PrepQuery = string.Format("DROP TABLE [{0}].[dbo].[{1}];", 
                                                    this.InitialCatalog, 
                                                    this.ImportedData.TableName);
                }
                break;
            case TablePrepAction.Truncate :
                {
                    this.PrepQuery = string.Format("TRUNCATE TABLE [{0}].[dbo].[{1}];", 
                                                    this.InitialCatalog, 
                                                    this.ImportedData.TableName);
                }
                break;
        }
    }
  • BuildCreateTableQuery() - This method constructs the query that creates the table. It uses the column hints list discussed above to determine column names and data types. There are two overloads of this method. One creates a table with ALL of the columns in the imported data, and the other creates a table based on the columns specified in a SqlParameter array.
    C#
    /// <summary>
    /// Builds the create table query (if table does not exist, create it).This 
    /// method includes ALL fields from the loaded worksheet.
    /// </summary>
    protected virtual void BuildCreateTableQuery()
    {
        StringBuilder query = new StringBuilder();
        StringBuilder fields = new StringBuilder();
        query.AppendFormat("IF (OBJECT_ID('{0}','U') IS NULL ",
                             this.ImportedData.TableName).AppendLine();
        query.AppendFormat("    CREATE TABLE {0}.dbo.{1}", 
                            this.InitialCatalog, 
                            this.ImportedData.TableName).AppendLine();
        query.AppendLine("(");
        for (int i = 0; i < ImportedData.Columns.Count; i++)
        {
            string colName = Convert.ToString(this.ImportedData.Columns[i]);
            if (!string.IsNullOrEmpty(colName))
            {
                HintItem hint = this.ColumnHints.FirstOrDefault(x=>x.Name == colName);
                string typeName = (hint == null) 
                                    ? this.GetDbDataType(this.ImportedData.Columns[i]) 
                                    : this.GetDbDataType(hint.ItemType.Name);
                fields.AppendFormat("{0}[{1}] {2} NULL", 
                                    (fields.Length>0) ? "," : "", 
                                    colName, 
                                    typeName).AppendLine();
            }
        }
        query.Append(fields);
        query.Append(");");
        this.CreateQuery = query.ToString();
    }
    /// <summary>
    /// Builds the CREATE TABLE query using the fields as specified by the paramaters 
    /// list.
    /// </summary>
    /// <param name="parameters">List of sql parameters</param>
    protected virtual void BuildCreateTableQuery(SqlParameter[] parameters)
    {
        StringBuilder query = new StringBuilder();
        StringBuilder fields = new StringBuilder();
        query.AppendFormat("IF (OBJECT_ID('{0}','U') IS NULL ", 
                            this.ImportedData.TableName).AppendLine();
        query.AppendFormat("    CREATE TABLE {0}.dbo.{1}", 
                            this.InitialCatalog, 
                            this.ImportedData.TableName).AppendLine();
        query.AppendLine("(");
        // find all hints that are represented in the parameters list, and include 
        // them in the query
        List<hintitem> hints = this.ColumnHints.Where
                                     (a => parameters.Any(b => b.ParameterName == a.DbValue)).ToList();
        foreach(HintItem hint in hints)
        {
            string typeName = this.GetDbDataType(hint.ItemType.Name);
            fields.AppendFormat("{0}{1} {2} NULL", 
                                (fields.Length>0) ? "," : "", 
                                hint.DbName, 
                                typeName).AppendLine();
        }
        query.Append(fields);
        query.Append(");");
        this.CreateQuery = query.ToString();
    }
  • BuildInsertQuery() - This method constructs the insert query that inserts the imported data into the import table. Like the "create table" method above, there are two overloads, one that inserts ALL columns, and one that inserts columns specified in a SqlParameter array.
    C#
    /// <summary>
    /// Constructs the insert query we need to insert the imorted data into the sql 
    /// table. This method includes ALL fields from the loaded worksheet.
    /// </summary>
    protected virtual void BuildInsertQuery()
    {
    	StringBuilder query = new StringBuilder();
    	StringBuilder fields = new StringBuilder();
    	StringBuilder values = new StringBuilder();
    	// use the existing hints collection to determine the columns to be inserted.
    	foreach(HintItem hint in this.ColumnHints)
    	{
    		fields.AppendFormat("{0}{1}", 
    							(fields.Length>0) ? "," : "", 
    							hint.DbName).AppendLine();
    		values.AppendFormat ("{0}{1}", 
    								(fields.Length>0) ? "," : "", 
    								hint.DbValue).AppendLine();
    	}
    	query.AppendFormat("INSERT INTO [{0}].[dbo].[{1}]", 
    						this.InitialCatalog, 
    						this.ImportedData.TableName).AppendLine();
    	query.AppendLine("(");
    	query.Append(fields);
    	query.AppendLine(")").AppendLine("VALUES").AppendLine("(");
    	query.Append(values);
    	query.AppendLine(");");
    	this.InsertQuery = query.ToString();
    }
    /// <summary>
    /// Builds the INSERT query using the fields as specified by the paramaters list.
    /// </summary>
    /// <param name="parameters">List of sql parameters</param>
    protected virtual void BuildInsertQuery(SqlParameter[] parameters)
    {
    	StringBuilder query = new StringBuilder();
    	StringBuilder fields = new StringBuilder();
    	StringBuilder values = new StringBuilder();
    	// find all hints that are represented in the parameters list, and include 
    	// them in the query
    	List<hintitem> hints = this.ColumnHints.Where
                                    (a => parameters.Any(b => b.ParameterName == a.DbValue)).ToList();
        foreach(HintItem hint in hints)
    	{
    		fields.AppendFormat("{0}{1}", (fields.Length>0)?",":"", hint.DbName).AppendLine();
    		values.AppendFormat ("{0}{1}", (fields.Length>0)?",":"", hint.DbValue).AppendLine();
    	}
    	query.AppendFormat("INSERT INTO [{0}].[dbo].[{1}]", 
                            this.InitialCatalog, 
                            this.ImportedData.TableName).AppendLine();
        query.AppendLine("(");
    	query.Append(fields);
    	query.AppendLine(")").AppendLine("VALUES").AppendLine("(");
    	query.Append(values);
    	query.AppendLine(");");
    	this.InsertQuery = query.ToString();
    }

The remaining methods in the class are helper methods that determine what the database-equivalant data type is for the hinted columns, and actually calling the SQL queries that perform the database functionality.

HintItem and HintItemList

The HintItem class allows the importer class to automatically establish appropriate data types for imported columns. The most interesting part of the class is the DetermineType() method.

If the type is not already determined to be a string, we go ahead and try to parse it into a reasonable type. As you can see, whether we're importing an Excel or CSV file plays a part in type determination. Also, this code only supports one integer type (Int32). This is probably a deficiency that will eventually come back to bite me in the ass, but it should be easy to address if the need arises.

/// <summary>
/// Determines the data type of the column.
/// </summary>
/// <param name="value">The value to be examined</param>
/// <param name="fromCSV">True if this is being dne in a CSV importer.</param>
public void DetermineType(object value, bool fromCSV=false)
{
    // if the type isn't already a string
    if (this.ItemType == null || !this.ItemType.Name.IsLike("String"))
    {
        Type theType = this.ItemType;
        // we have to check if its from a CSV import becaus everything imported 
        // from a CSV defaults to string.
        if (value is string && !fromCSV)
        {
            theType = typeof(String);
        }
        else
        {
            string valueStr = value.ToString();
            DateTime parsedDateTime;
            Int32    parsedInt32;
            double   parsedDouble;
            bool     parsedBool;
            // We try to parse the value using all standard datetime formats 
            // (custom formats are not supported unless you want to modify the 
            // DateFormatStrings object.).
            if (DateTime.TryParseExact(valueStr, DateFormatStrings.Formats, CultureInfo.CurrentUICulture, DateTimeStyles.None, out parsedDateTime))
            {
                theType = typeof(DateTime);
            }
            else if (Int32.TryParse(valueStr, out parsedInt32))
            {
                // we only want to change the type if it's not already a double
                if (theType == null || !theType.Name.IsLike("Double"))
                {
                    theType = typeof(Int32);
                }
            }
            else if (double.TryParse(valueStr, out parsedDouble))
            {
                theType = typeof(Double);
            }
            else if (bool.TryParse(valueStr, out parsedBool))
            {
                theType = typeof(Boolean);
            }
            else 
            {
                theType = typeof(String);
            }
        }
        this.ItemType = theType;
    }
}

ExcelImportBase

The original implementation of this class contained everything in the ImportBase class, because I was only going to do an importer for Excel.

(Blunt honesty section) - Then, I figured I should do an importer for CSV files as well, and thought, "I'll just refer to the CSV file parser article I wrote last year, and let the programmer be a programmer and mold that article's code to his needs". After writing the Excel importer class, I figured that a certain group of programmers would be offended by my lazy attitude (and the fact that they would have to do some actual "work"), so I decided to mold that article myself (described in the next section). That's why we now have three abstract classes instead of just one.

This class is also abstract, meaning you have to inherit it with another class that YOU have to write. Since this is all about supporting programmer written packages for SQLXAgent, I'm assuming that's why you're going to be doing this, but it can certainly be used in other projects that you might dream up.

Since most of the code required is in the ImportBase class, there's very little that had to be done in this class. The most interesting part of this class is that I used a NuGet-available library, called ExcelDataReader (and its companion extension library, ExcelDataReader.DataSet), to facilitate the actual load-the-file requirement. This library reads an Excel file into a DataTable object. But that's all it does. It does not appear to make any attempt to determine appropriate data types, and if I recall, everything is stored in the table as an object. This was entirely unhelpful, and led me to develop the column hints code discussed above. The following is the entire contents of the class.

C#
/// <summary>
/// Base class for importing excel spreadsheets
/// </summary>
public abstract class ExcelImportBase : ImporterBase
{
    public ExcelImportBase(string filename, string sheetName, string connectionString):base(filename, sheetName, connectionString)
    {
        this.FileType  = ImportFileType.EXCEL;
        this.FileName  = filename;
        this.SheetName = sheetName;
        this.Init();
    }
    ~ExcelImportBase()
    {
    }
    /// <summary>
    /// Initialize this object with default property values.
    /// </summary>
    protected override void Init()
    {
        base.Init();
        this.FileType = ImportFileType.EXCEL;
    }
    /// <summary>
    /// Import the specified sheet from the file.
    /// </summary>
    public override void Import()
    {
        if (this.CanImport)
        {
            // load the worksheet from the file
            this.ImportedData  = this.ReadSheet(this.FileName, this.SheetName);
            // process the data
            base.ProcessData();
            // update the database
            this.UpdateDatabase();
        }
    }
    /// <summary>
    /// Reads the specified sheet and returns a DataTable object.
    /// </summary>
    /// <param name="filename">The name of the file conbtaining the desired worksheet</param>
    /// <param name="sheetName">The name of the worksheet from which to import</param>
    /// <returns>Datatable with content from source</returns>
    protected virtual DataTable ReadSheet(string filename, string sheetName)
    {
        DataSet ds = null;
        DataTable data  = null;
        try
        {
            using (System.IO.FileStream stream = File.Open(filename, FileMode.Open, FileAccess.Read))
            {
                using (var reader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream))
                {
                    do
                    {
                        if (reader.Name.IsLike(sheetName))
                        {
                            ds = reader.AsDataSet();
                        }
                    } while (reader.NextResult());
                }
                if (ds != null)
                {
                    data = ds.Tables[0];
                }
            }
        }
        catch (Exception ex)
        {
            throw new SQLXAgentException(SQLXExceptionEnum.ExcelDataReaderError, ex);
        }
        return data;
    }
    /// <summary>
    /// Abstract method that gets the data for the specified row
    /// </summary>
    /// <param name="rowIndex">The 0-based row index</param>
    /// <returns>An array of SqParameter objects</returns>
    protected abstract override SqlParameter[] GetRowParameters(int rowIndex);
}

The only abstract component of the class is the GetRowParameters() method. This is where you will specify which columns to actually import. Overriding this method with an empty method should cause all columns to be imported.

CSVImportBase

This class closely resembles my original CSV Parser article, and was modified to fit the new application-specific paradigm.

ATTENTION NEW PROGRAMMERS - this is how you make use of CodeProject articles written by someone else. You inspect the code to evaluate its fitness for use in your project, and if necessary, make changes that adapt it to your own usage. A well-written article will be all the guidance you need. Even if it's NOT well-written, take the time to evaluate the code, add necessary comments, and go on with your life. You'll be a better programmer simply by making a little extra effort on your own. And now, back to our story...

I had to retain the code responsible for loading and parsing of the data, but adapted the code to store the results in a DataTable object for use by the ImportBase class. Because there was so much involved in loading the file (I'm relying on my own code instead of having a library to do it for me), this class is noticeably larger than the ExcelImportBase class, and involved additional ExcelImportBase class, and involved additional CSV-only properties and methods (that were in the article from which I got the code). Instead of boring you with crap I wrote in the old article, I'll simply mention the new method I had to write to store the imported data into a DataTable object.

C#
/// <summary>
/// Adds a row to the datatable (initializes the columns if the datatable is new).
/// </summary>
/// <param name="parts"></param>
protected virtual void AddToDataTable(string[] parts)
{
    // Assume there's no header row, and default the column names to "Column_n" 
    // of type "object". We need to have columns specified before we can add 
    // new rows.
    if (this.ImportedData.Columns.Count == 0)
    {
        for (int i = 0; i < parts.Length; i++)
        {
            this.ImportedData.Columns.Add(string.Format("Column_{0}", i+1), typeof(object));
        }
    }
    // Create a new row. New rows contain n items in the ItemArray property. The 
    // number of items is determined by the number of columns the datatable has.
    DataRow row = this.ImportedData.NewRow();
    // now add our parts to the row
    for (int i = 0; i < parts.Length; i++)
    {
        string part = parts[i];
        if (!string.IsNullOrEmpty(this.CurrencySymbol))
        {
            part = (part.StartsWith(this.CurrencySymbol) || part.EndsWith(this.CurrencySymbol)) ? part.Replace(this.CurrencySymbol, "").Replace(",", "") : part;
        }
        row[i] = parts[i].Replace("\"","").Trim();
    }
    // add the new row to the datatable.
    this.ImportedData.Rows.Add(row);
}

Once the data table is populated, the ImportBase code kicks in to perform the rest of the import functionality.

Writing Your Own Importer

Since the three import classes discussed above are abstract, this means you are required to write your own importer class in order to perform an import operation. Since so much of the work is already done for you, this requires trivial effort on the part of the programmer.

Even though the two following classes are nearly identical, in the spirit of completeness, I included the all the code for each class.

Sample Excel Importer

C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data.SqlClient;
using System.Data;
using SQLXCommon;
using SQLXPkgBase;
namespace PkgSampleImportExcel2
{
    /// <summary>
    /// This class represents a sample of how you would code a class that inherits from 
    /// the abstract ExcelImportBase class (which is the class that does all of the heavy 
    /// lifting for the import excel worksheet process). All methods in the base class 
    /// are virtual, so you can override any of them that you deem necessary. It is the 
    /// responsibility of this derived class to provide success/failure information to 
    /// the calling DLL.
    /// </summary>
    public class SampleExcelImporter : ExcelImportBase
    {
        public SampleExcelImporter(string filename, string sheetName, string connectionstring)
               : base(filename, sheetName, connectionstring)
        {
            this.Init();
        }
        /// <summary>
        /// The Init method initializes all of the properties that determines how the 
        /// importer does its work. After calling base.Init(), set applicable properties 
        /// to desired values.
        /// </summary>
        protected override void Init()
        {
            // By default, the base importer code creates a table called "Imported_XXXXXX" 
            // (where "XXXXXX" is the name of the file). If you want to give it a name 
            // yourself, specify it AFTER the call to base.Init().
            base.Init();
            // TO-DO: Change the property values that make the importer behave as desired.
        }
        /// <summary>
        /// Creates the column hits used to define what columns are imported from the file 
        /// into the database. Of course, doing this manually requires knowledge of the 
        /// content of the file being imported. If 
        /// </summary>
        /// <param name="fromCSV">True if this is importing from a CSV file. Default value is false.</param>
        protected override void CreateColumnHints(bool fromCSV=false)
        {
            base.CreateColumnHints(fromCSV);
            // TO-DO: If you want to omit columns being imported, remove the hint items 
            //        here. You can remove columns beither by name, or by column index, 
            //        using code like you see below. If the indicated column doesn't 
            //        exist, no exception will be thrown.
            //this.ColumnHints.Remove("my column name");
        }
        /// <summary>
        /// This is an abstract method override which pulls the desired columns for 
        /// saving into the database.
        /// </summary>
        /// <param name="rowIndex">The index of the row for which data is being retrieved.</param>
        /// <returns>An array of SqlParameter objects</returns>
        /// <remarks>Do not put a try/catch block in this method. Instead, catch all 
        /// exceptions in the calling code so the return value and fail reason can 
        /// be set with a reasonable value.</remarks>
        protected override SqlParameter[] GetRowParameters(int rowIndex)
        {
            int      id;           
            string   comment;   
            DateTime theDate; 
            double   floatValue;
            int      intValue;
            string   ipAddr;
            // get your column values
            id         = (int)     this.GetColumnValue(rowIndex, "ID");
            comment    = (string)  this.GetColumnValue(rowIndex, "Comment");
            theDate    = (DateTime)this.GetColumnValue(rowIndex, "TheDate");
            floatValue = (double)  this.GetColumnValue(rowIndex, "FloatValue");
            intValue   = (int)     this.GetColumnValue(rowIndex, "IntValue");
            ipAddr     = (string)  this.GetColumnValue(rowIndex, "IPAddress");
            // Create the SqlParameter array (this is where you choose which fields 
            // will be used. This will affect the CREATE and INSERT queries.
            SqlParameter[] parameters = new SqlParameter[]
            {
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "ID"        ).DbValue, id),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "Comment"   ).DbValue, comment),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "TheDate"   ).DbValue, theDate),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "FloatValue").DbValue, floatValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IntValue"  ).DbValue, intValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IPAddress" ).DbValue, ipAddr),
            };
            return parameters;
        }
    }
}

Sample CSV Importer

C#
using SQLXPkgBase;
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace PkgSampleImportCSV
{
    /// <summary>
    /// This class represents a sample of how you would code a class that inherits from 
    /// the abstract CSVImportBase class (which is the class that does all of the heavy 
    /// lifting for the import excel worksheet process). All methods in the base class 
    /// are virtual, so you can override any of them that you deem necessary. It is the 
    /// responsibility of this derived class to provide success/failure information to 
    /// the calling DLL.
    /// </summary>
    public class SampleCSVImporter : CsvImportBase
    {
        public SampleCSVImporter(string filename, string connectionstring)
               : base(filename, connectionstring)
        {
            this.Init();			
        }
        /// <summary>
        /// The Init method initializes all of the properties that determines how the 
        /// importer does its work. After calling base.Init(), set applicable properties 
        /// to desired values.
        /// </summary>
        protected override void Init()
        {
            // By default, the base importer code creates a table called "Imported_XXXXXX" 
            // (where "XXXXXX" is the name of the file). If you want to give it a name 
            // yourself, specify it AFTER the call to base.Init().
            base.Init();
            // TO-DO: Change the property values that make the importer behave as desired.
        }
        /// <summary>
        /// Creates the column hits used to define what columns are imported from the file 
        /// into the database. Of course, doing this manually requires knowledge of the 
        /// content of the file being imported. If 
        /// </summary>
        /// <param name="fromCSV">True if this is importing from a CSV file. Default value is false.</param>
        protected override void CreateColumnHints(bool fromCSV=false)
        {
            base.CreateColumnHints(fromCSV);
            // TO-DO: If you want to omit columns being imported, remove the hint items 
            //        here. You can remove columns beither by name, or by column index, 
            //        using code like you see below. If the indicated column doesn't 
            //        exist, no exception will be thrown.
            //this.ColumnHints.Remove("my column name");
        }
        /// <summary>
        /// This is an abstract method override which pulls the desired columns for 
        /// saving into the database.
        /// </summary>
        /// <param name="rowIndex">The index of the row for which data is being retrieved.</param>
        /// <returns>An array of SqlParameter objects</returns>
        /// <remarks>Do not put a try/catch block in this method. Instead, catch all 
        /// exceptions in the calling code so the return value and fail reason can 
        /// be set with a reasonable value.</remarks>
        protected override SqlParameter[] GetRowParameters(int rowIndex)
        {
            int      id;           
            string   comment;   
            DateTime theDate; 
            double   floatValue;
            int      intValue;
            string   ipAddr;
            // get your column values
            id         = (int)     this.GetColumnValue(rowIndex, "ID");
            comment    = (string)  this.GetColumnValue(rowIndex, "Comment");
            theDate    = (DateTime)this.GetColumnValue(rowIndex, "TheDate");
            floatValue = (double)  this.GetColumnValue(rowIndex, "FloatValue");
            intValue   = (int)     this.GetColumnValue(rowIndex, "IntValue");
            ipAddr     = (string)  this.GetColumnValue(rowIndex, "IPAddress");
            // Create the SqlParameter array (this is where you choose which fields 
            // will be used. This will affect the CREATE and INSERT queries.
            SqlParameter[] parameters = new SqlParameter[]
            {
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "ID"        ).DbValue, id),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "Comment"   ).DbValue, comment),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "TheDate"   ).DbValue, theDate),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "FloatValue").DbValue, floatValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IntValue"  ).DbValue, intValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IPAddress" ).DbValue, ipAddr),
            };
            return parameters;
        }
    }
}

How to Include in Your Own Projects

I suspect that some of you will decide that you would like to utilize this code in your own projects. To that end, this section is for you. To use these files in your own projects, the absolute easiest way forward is as follows.

In Windows Explorer:

  • Copy the SQLXPkgBase folder to your solution folder.
     
  • Copy the SQLXCommon folder to your solution folder.
     
  • Rename the folders and files (optional) to your own naming scheme/preferences
     
  • Copy the 3rdParty folder to your solution folder.
     
  • The only thing you need from the 3rdParty folder (for this task) are the ExcelDataReader*.* files.
     

In Visual Studio:

  • Add the Common and PkgBase projects to your solution.
     
  • Resolve broken references in the (renamed) PkgBase project. The ExcelDataReader and SQLXCommon references are the ones that will be broken).
     
  • Resolve naming issues. Most of the problems will arise if you renamed the namespace in the Common project. Change all references to SQLXCommon to whatever you called your common namespace.
     
  • You can safely exclude the SQLXAgentPkgBase class from the (renamed) PkgBase project.
     
  • There are other ways to prune the code to its bare minimum, but I leave that was an exercise for the programmer.
     

In Closing

As I cited above, I only wrote the Excel importer because it made sense, given the purpose of a SQLXAgent package, and the fact that a lot of your typical DTSX packages in an enterprise environment involves importing data from an Excel file. (More brutal honesty) On the other hand, I wrote the CSV importer code simply to keep the ungrateful bastards out there in the audience from harping on me for providing what they might deem as an "incomplete solution". I make absolutely no apologies for my attitude in this regard. My general feeling is that if you want to be coddled and taken by the hand, go back to pre-school.

History

  • 29 Sep 2017 - Initial publication.
     

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)