Introduction
Several times I've seen questions similar to:
- How to get data into datatable from multiple Excel sources using ADO.NET?
- How to join data from multiple workbooks using single
OleDbConnection
object? - How to read data from multiple workbooks using into a single
Datatable
object?
This tip shows how to achieve that.
On the other side, sometimes we need to compare data stored in 2 different Excel files. The most popular solution for that is to loop through the data in both files using Interop[^].
The pseudo-code is as follows:
For each row in FirstFile
For each row in SecondFile
'comparison is coming here
Next
Next
Definitely it is an inefficient way! So, what to do? In that case, I would strongly recommend to use ADO.NET (OleDb)!
Background
There are two interesting articles on MSDN about using ADO.NET with Excel (Office) application:
- Accessing MS Office Data from .NET applications (VB.NET)
- How to use ADO.NET to retrieve and modify records in an Excel workbook (VB.NET)
This is not a complete list of useful articles, of course, but the most recommended.
You'll find many useful articles in the CodeProject Knowledge Base too. But nowhere have I found a way to compare data or join/read data from multiple Excel files using single OleDb connection.
Idea
The main idea is to use IN clause. OleDb provider for MS JET database engine supports IN
clause in the same way as Microsoft Access database does.
SELECT *
FROM [Sheet1$] IN 'D:\SampleFile.xls' 'Excel 8.0;'
Using the Code
Below is the complete C# and VB.NET code for ConsoleApplication
.
using System;
using System.Data;
using System.Data.OleDb;
using System.Collections.Generic;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
OleDbConnection oConn = null;
OleDbCommand oComm = null;
OleDbDataReader oRdr = null;
DataTable oTbl = null;
String sFirstFile = String.Empty;
String sSecondFile= String.Empty;
String sConnString = String.Empty;
String sCommand = String.Empty;
try
{
sFirstFile = @"D:\Fruits1.xls";
sSecondFile = @"D:\Fruits2.xls";
sConnString = @"Provider=Microsoft.Jet.OLEDB.4.0;
Data Source=" + sFirstFile + ";Extended Properties='Excel 8.0;HDR=Yes';";
oConn = new OleDbConnection(sConnString);
oConn.Open();
sCommand = @"SELECT NameOfFruit" + Environment.NewLine +
"FROM [Fruits$]" + Environment.NewLine +
"UNION ALL" + Environment.NewLine +
"SELECT NameOfFruit" + Environment.NewLine +
"FROM [Fruits$] IN '" + sSecondFile + "' 'Excel 8.0;';";
oComm = new OleDbCommand(sCommand, oConn);
oRdr = oComm.ExecuteReader();
oTbl = new DataTable();
oTbl.Load(oRdr);
foreach (DataRow row in oTbl.Rows)
{
Console.WriteLine(row["NameOfFruit"].ToString());
}
Console.ReadKey();
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
Console.ReadKey();
}
finally
{
if (oRdr!=null) oRdr.Close();
oRdr = null;
if (oTbl !=null) oTbl.Dispose();
oComm.Dispose();
oConn.Close();
oConn.Dispose();
}
}
}
}
Module Module1
Sub Main()
Dim oConn As OleDb.OleDbConnection = Nothing
Dim oComm As OleDb.OleDbCommand = Nothing
Dim oRdr As OleDb.OleDbDataReader = Nothing
Dim oTbl As Data.DataTable = Nothing
Dim sFirstFile As String = String.Empty
Dim sSecondFile As String = String.Empty
Try
sFirstFile = "D:\Fruits1.xls"
sSecondFile = "D:\Fruits2.xls"
oConn = New OleDb.OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;_
Data Source=" & sFirstFile & ";Extended Properties='Excel 8.0;HDR=Yes';")
oConn.Open()
oComm = New OleDb.OleDbCommand("SELECT NameOfFruit" & vbCr & _
"FROM [Fruits$]" & vbCr & _
"UNION ALL" & vbCr & _
"SELECT NameOfFruit" & vbCr & _
"FROM [Fruits$] IN '" & _
sSecondFile & "' 'Excel 8.0;';", oConn)
oRdr = oComm.ExecuteReader()
oTbl = New DataTable
oTbl.Load(oRdr)
For Each row As DataRow In oTbl.Rows
Console.WriteLine(row.Item("NameOfFruit").ToString())
Next
Console.ReadKey()
Catch ex As Exception
Console.WriteLine(ex.Message)
Console.ReadKey()
Finally
If Not oRdr Is Nothing Then oRdr.Close()
oRdr = Nothing
If Not oTbl Is Nothing Then oTbl.Dispose()
oComm.Dispose()
oConn.Close()
oConn.Dispose()
End Try
End Sub
End Module
As you can see, there is only one OleDbConnection
and one OleDbCommand
to retrieve data from 2 different workbooks.
How to Compare Data Stored in Different Workbooks?
Using the same method as above but with different query:
SELECT NameOfFruit
FROM [Fruits$]
WHERE NameOfFruit NOT IN (SELECT NameOfFruit _
FROM [Fruits$] IN 'D\Data\Fruits2.xls' 'Excel 8.0;');
Using the above query, you'll fetch unique data from Fruit1.xls
file. Equal data are ignored.
Points of Interest
Using OleDb to fetch data or compare data stored in two different workbooks is:
- faster than using Interop
- more universal (does not require installation of Microsoft Excel)
History
- 2014/01/03 - First version (misspelling corrected)
- 2013/12/27 - First version