Introduction
The code snippet in this article intendes to explain
- How to invoke Linq queries on DataRowCollection
- How to use group by clause using multiple dynamic columns/keys over DataRowCollection in memory
Using the code
1 - How to invoke LINQ queries on DataRowCollection
.net framework by default does not contains extension methods that supports invoking LINQ Queries over DataRowCollection thus datatable/dataset by design does not support LINQ queries out of the box (atleast in .net 3.5).
As LINQ queries are nothing but an extension we can over come this limitation by writing a wrapper of our own that could enumerate over DataRowCollection thus allowing us to invoke LINQ Queries
Class EnumerableDataRows (see code below) allows us to do the same, the constructor takes the IEnumerable DataRowCollection and the enumerator basically returns the data row one at a time on demand
class EnumerableDataRows<T> : IEnumerable<T>, Enumerable
{
IEnumerable dataRows;
EnumerableDataRowList(IEnumerable items)
{
dataRows = items;
}
IEnumerator<T> IEnumerable<T>.GetEnumerator()
{
foreach(T dataRow in dataRows)
{
yield return dataRow;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
IEnumerable<T> iEnumerable = this;
return iEnumerable.GetEnumerator();
}
}
The following is a sample data table containing four columns namely Name, Age, Height and Diet
static DataTable GetTable
{
get
{
DataTable dataTable = new DataTable("MyTable");
dataTable.Columns.Add("Name");
dataTable.Columns.Add("Age");
dataTable.Columns.Add("Height");
dataTable.Columns.Add("Diet");
dataTable.Rows.Add("Bob", "28", "170", "Vegan");
dataTable.Rows.Add("Michael", "28", "210", "Vegan");
dataTable.Rows.Add("James", "28", "190", "NonVegan");
dataTable.Rows.Add("George", "28", "200", "NonVegan");
return dataTable;
}
}
With the EnumerableDataRows class we can make the table's DataRowCollection enumerable thus enabling LINQ
EnumerableDataRowList<DataRow> enumerableRowCollection = new EnumerableDataRowList<DataRow>(Table.Rows);
This completes our first objective, we can now invoke LINQ Queries on enumerableRowCollection
var groupedRows = from row in enumerableRowCollection group row by row["Age"];
2 - How to use group by clause using multiple dynamic columns/keys over DataRowCollection in memory
LINQ by default does not support grouping over multiple columns for in-memory objects (datatable in this example), as we can do in SQL. We can achieve the same in multiple ways such as writing multiple group by clause recursively for the various keys or by creating an in memory buckets to group programmatically. In this topic I am going to explain the way I feel is a better idea in terms of implementation or comprehensibility (I'd love to hear your comments if you think this a bad way etc).
The way I have implemented multiple column gouping using LINQ is by string concatenation i.e. keys are devised based on concatenation result
Logic: if we are grouping on age and diet columns, then concatenate both age and diet value(s) as one string and group based on the new string. The following function does exactly the same, it basically iterates over the dynamic list of columns, retrieve the column values and returns the concatenated string
public static String GroupData(DataRow dataRow)
{
String[] columnNames = new[] { "Age", "Diet" };
stringBuilder.Remove(0, stringBuilder.Length);
foreach (String column in columnNames)
{
stringBuilder.Append(dataRow[column].ToString());
}
return stringBuilder.ToString();
}
With our first objective achieved we can now group on multiple columns using the above logic
Func<DataRow, String> groupingFunction = GroupData;
var groupedDataRow = enumerableRowCollection.GroupBy(groupingFunction);
On execution the datatable is grouped something like this:
Grouped by -> 28Vegan
Items -> Bob
Items -> Michael
----
Grouped by -> 28NonVegan
Items -> James
Items -> George
----
Cool ... we are done with the two objectives above.
The following is the entire code snippet for your reference ... Enjoy and have fun!!!
using System;
using System.Text;
using System.Linq;
using System.Data;
using System.Collections.Generic;
using System.Collections;
using System.Linq.Expressions;
using System.Xml.Linq;
class EnumerableDataRowList<T> : IEnumerable<T>, IEnumerable
{
IEnumerable dataRows;
internal EnumerableDataRowList(IEnumerable items)
{
dataRows = items;
}
IEnumerator<T> IEnumerable<T>.GetEnumerator()
{
foreach (T dataRow in dataRows)
yield return dataRow;
}
IEnumerator IEnumerable.GetEnumerator()
{
IEnumerable<T> iEnumerable = this;
return iEnumerable.GetEnumerator();
}
}
public class mainclass
{
static StringBuilder stringBuilder = new StringBuilder();
public static String GroupData(DataRow dataRow)
{
String[] columnNames = new[] { "Age", "Diet" };
stringBuilder.Remove(0, stringBuilder.Length);
foreach (String column in columnNames)
{
stringBuilder.Append(dataRow[column].ToString());
}
return stringBuilder.ToString();
}
static DataTable Table
{
get
{
DataTable dataTable = new DataTable("MyTable");
dataTable.Columns.Add("Name");
dataTable.Columns.Add("Age");
dataTable.Columns.Add("Height");
dataTable.Columns.Add("Diet");
dataTable.Rows.Add("Bob", "28", "170", "Vegan");
dataTable.Rows.Add("Michael", "28", "210", "Vegan");
dataTable.Rows.Add("James", "28", "190", "NonVegan");
dataTable.Rows.Add("George", "28", "200", "NonVegan");
return dataTable;
}
}
public static void Main()
{
EnumerableDataRowList<DataRow> enumerableRowCollection = new EnumerableDataRowList<DataRow>(Table.Rows);
Func<DataRow, String> groupingFunction = GroupData;
var groupedDataRow = enumerableRowCollection.GroupBy(groupingFunction);
foreach (var keys in groupedDataRow)
{
Console.WriteLine(String.Format("Grouped by -> {0}",keys.Key));
foreach (var item in keys)
{
Console.WriteLine(String.Format(" Item -> {0}", item["Name"]));
}
Console.WriteLine("----");
}
Console.ReadKey();
}
}