Introduction
In this tip, we will see how to perform case sensitive comparison of datatables.
Background
In my project, there came a situation where it is needed to perform Case Sensitive comparison of columns between datatables. I am sharing my learnings here.
Using the Code
I am going to have data in two datatables. Let's call one datatable as Source
and the other as Target
.
I will be using Windows Forms application for this demo. I created a form with 5 data gridviews and each show:
- displays data from source datatable
- displays data from second datatable
- displays matching source rows from Case-Insensitive comparison between source and target
- displays matching source row from Case-Sensitive comparison between source and target and
- displays mismatching source row from Case-Sensitive comparison between source and target
I made it in such a way that when the form is loaded initially, it will display the data from source and target. If we see the loaded data, all columns are the same except for Street of second row. Source has "TestStreet2
" and Target has "TESTStreet2
". TEST is all caps in Target.
Comparison will be performed and results will be displayed when "Compare and Display Result" button is clicked. Before seeing the screenshot with results, let's see what will be the possible result.
Case-Insensitive comparison should return both source records, Case-Sensitive comparison should return only the first record from source and Case-Sensitive mismatch should return only the second record from source.
Now let's look at the screenshots.
I found the LINQ query for column by column comparison from the below article:
It is needed to specify the columns to be compared. In this example, I am comparing all the columns. The below code will perform case insensitive comparison of columns.
List<string> columnsToCompare = new List<string>() { "FirstName", "LastName", "Street", "City" };
dtSource.PrimaryKey = columnsToCompare.Select(x => dtSource.Columns[x]).ToArray();
dtTarget.PrimaryKey = columnsToCompare.Select(y => dtTarget.Columns[y]).ToArray();
var matchingRows = from DataRow rowFile in dtSource.Rows
where dtTarget.Rows.Contains(rowFile.ItemArray.Where(
(a, b) => columnsToCompare.Contains(dtSource.Columns[b].ColumnName)).ToArray())
select rowFile;
dtSource.CaseSensitive = true;
dtTarget.CaseSensitive = true;
columnsToCompare
is a list of string
s which contains the columns to be compared. If you do not want any column to be compared, just remove it from the list.
The next two lines set the Primary key(s) for the tables. But these are not actual primary keys but columns to be compared. I had my share of confusion before trying this out myself.
In order to perform case sensitive comparison, the main part of this article, we have to set the case sensitivity of the datatables to true
. The below code does the trick.
dtSource.CaseSensitive = true;
dtTarget.CaseSensitive = true;
Run the code to compare after this.
var matchingRows = from DataRow rowFile in dtSource.Rows
where dtTarget.Rows.Contains(rowFile.ItemArray.Where(
(a, b) => columnsToCompare.Contains(dtSource.Columns[b].ColumnName)).ToArray())
select rowFile;
Points of Interest
In this method of performing the comparison, the order of the columns in source and target must match to get the desired result. This method will not support NULL
values in columns. Null
reference exception will be thrown.
I did not add the LINQ query for finding the mismatching rows. I leave it for you to figure out.
History
- First version - 03/21/2015