Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#

Combining DataTables with Different Schema

4.33/5 (3 votes)
21 Nov 2017CPOL2 min read 27.7K   189  
There is a problem with accessing Excel files via OleDb which contain more than 255 columns: they are truncated and need to be fetched using ranges. But what do you do with DataTables after that?

Introduction

This came up because there is a problem in QA with OleDb access to Excel files which means that only 255 columns are fetched at a time. That's not difficult to work around - use Ranges to fetch the separate columns into different tables. But ... what then? It's a pain to handle two (or more) separate tables in order to process a single row, and the DataTable.Merge method combines two tables with the same schema to provide more rows, not more columns.

This tip provides a "combine columns" method to "merge" tables producing the same number of rows, but more columns. Do note that it will fail if the column names overlap - It wouldn't be difficult to add code to get round that, but it would slow the process so I'll leave it as an exercise for the reader.

The Code that Does the Work

C#
/// <summary>
/// Merge a list of tables into a single table with more columns.
/// </summary>
/// <remarks>
/// This isn't the most efficient method, I'm sure - but the code
/// would get seriously incomprehensible to do it all via one
/// single Linq method call. For the sake of readability and
/// maintainability I'll live with the inefficiency.
/// </remarks>
/// <param name="baseTable"></param>
/// <param name="additionalTables"></param>
/// <returns></returns>
public static DataTable MergeTables(DataTable baseTable, params DataTable[] additionalTables)
    {
    // Build combined table columns
    DataTable merged = baseTable;
    foreach (DataTable dt in additionalTables)
        {
        merged = AddTable(merged, dt);
        }
    return merged;
    }
/// <summary>
/// Merge two tables into a single table with more columns.
/// </summary>
/// <param name="baseTable"></param>
/// <param name="additionalTable"></param>
/// <returns></returns>
public static DataTable AddTable(DataTable baseTable, DataTable additionalTable)
    {
    // Build combined table columns
    DataTable merged = baseTable.Clone();                  // Include all columns from base table in result.
    foreach (DataColumn col in additionalTable.Columns)
        {
        string newColumnName = col.ColumnName;
        merged.Columns.Add(newColumnName, col.DataType);
        }
    // Add all rows from both tables
    var bt = baseTable.AsEnumerable();
    var at = additionalTable.AsEnumerable();
    var mergedRows = bt.Zip(at, (r1, r2) => r1.ItemArray.Concat(r2.ItemArray).ToArray());
    foreach (object[] rowFields in mergedRows)
        {
        merged.Rows.Add(rowFields);
        }
    return merged;
    }

Only one line is complicated:

C#
var mergedRows = bt.Zip(at, (r1, r2) => r1.ItemArray.Concat(r2.ItemArray).ToArray());

Zip just executes a Lambda on each "pair" of items: a DataRow from the base table, and a DataRow from the additional table.

The Lambda just concatenates the two item collections into a single collection and converts it into an array of objects. The array just makes it easier to describe the collection content in the following loop: using a var there confuses the compiler a bit. (Now you see why I do this the inefficient way!)

Using the Code

It's pretty obvious how to use it - you need a few tables:

ID Name Email
1 Mike Mike@MyDomain.com
2 Susan Susan@MyDomain.com

 

City Country
London England
Cardiff Wales

 

Favourite Colour Favourite Fruit
Green Apple
Yellow Banana

As in:

SQL
DataTable dt1 = new DataTable();
dt1.Columns.Add("ID", typeof(int));
dt1.Columns.Add("Name", typeof(string));
dt1.Columns.Add("Email", typeof(string));
dt1.Rows.Add(1, "Mike", "Mike@MyDomain.com");
dt1.Rows.Add(2, "Susan", "Susan@MyDomain.com");
var dt2 = new DataTable();
dt2.Columns.Add("City", typeof(string));
dt2.Columns.Add("Country", typeof(string));
dt2.Rows.Add("London", "England");
dt2.Rows.Add("Cardiff", "Wales");
var dt3 = new DataTable();
dt3.Columns.Add("Favourite Colour", typeof(string));
dt3.Columns.Add("Favourite Fruit", typeof(string));
dt3.Rows.Add("Green", "Apple");
dt3.Rows.Add("Yellow", "Banana");

Then just combine the tables:

C#
DataTable dtMerged = MergeTables(dt1, dt2, dt3);
ID Name Email City Country Favourite Colour Favourite Fruit
1 Mike Mike@MyDomain.com London England Green Apple
2 Susan Susan@MyDomain.com Cardiff Wales Yellow Banana

History

  • 2017-11-21 First version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)