Introduction
Every once in a while, you may need to query data from a DataTable
which has a hierarchical structure. In other words, the DataTable
contains a foreign key column which points to itself thus creating a self-referencing structure. A typical example is an organization chart where employees have a reference to the manager.
In order to iterate the hierarchical structure correctly, you need to have some kind of recursive function. For this, I decided to create a new method called AsTree
which would return the selected portion from the tree.
Implementing AsTree()
Before going to the actual implementation, let’s have a look at the return values. The AsTree
method returns a set of TreeItem
objects. A tree item looks as follows:
public class TreeItem {
public int Level { get; set; }
public DataRow Row { get; set; }
public DataRow ParentRow { get; set; }
}
Public Class TreeItem
Public Property Level As Integer
Public Property Row As DataRow
Public Property ParentRow As DataRow
End Class
The Row
property contains the actual row from the tree while the ParentRow
contains its parent DataRow
(if any). The Level
property contains the depth of the row in the tree calculated from the starting point.
The AsTree
extension method is very simple:
public static HashSet<TreeItem> AsTree(this EnumerableRowCollection<DataRow> source,
string parentColumnName,
string childColumnName,
DataRow startRow,
TraverseDirection direction = TraverseDirection.FromParentToChild) {
string actualChildColumn;
string actualParentColumn;
HashSet<TreeItem> resultSet = new HashSet<TreeItem>();
resultSet.Add(new TreeItem() {
Level = 0,
ParentRow = null,
Row = startRow
});
if (direction == TraverseDirection.FromParentToChild) {
actualParentColumn = parentColumnName;
actualChildColumn = childColumnName;
} else {
actualParentColumn = childColumnName;
actualChildColumn = parentColumnName;
}
return DataTableTree.FillTree(source, resultSet, actualParentColumn, actualChildColumn,
startRow, 1);
}
<Extension()>
Public Function AsTree(source As EnumerableRowCollection(Of DataRow), _
parentColumnName As String, _
childColumnName As String, _
startRow As DataRow, _
Optional direction As DataTableTree.TraverseDirection =
DataTableTree.TraverseDirection.FromParentToChild) As HashSet(Of DataTableTree.TreeItem)
Dim actualChildColumn As String
Dim actualParentColumn As String
Dim resultSet As HashSet(Of DataTableTree.TreeItem) = New HashSet(Of DataTableTree.TreeItem)()
resultSet.Add(New DataTableTree.TreeItem() With {
.Level = 0,
.ParentRow = Nothing,
.Row = startRow
})
If (direction = DataTableTree.TraverseDirection.FromParentToChild) Then
actualParentColumn = parentColumnName
actualChildColumn = childColumnName
Else
actualParentColumn = childColumnName
actualChildColumn = parentColumnName
End If
Return DataTableTreeModule.FillTree(source, resultSet, actualParentColumn, actualChildColumn,
startRow, 1)
End Function
The method uses parameters as follows:
parentColumnName
is the DataColumn
name which acts as a parent key childColumnName
is the foreign key column referencing to the parent column startRow
is the initial row from which the tree is investigated direction
defines the traverse direction, either top-down or bottom-up. The default is top-down.
The AsTree
method simply adds the first element into the HashSet
and then, based on the direction
, decides which column acts as a parent and which one is the child. The method has an assumption that the foreign key is created using a single column.
The actual iteration is implemented in the following FillTree
method:
private static HashSet<TreeItem> FillTree(EnumerableRowCollection<DataRow> source,
HashSet<TreeItem> resultSet,
string actualParentColumn,
string actualChildColumn,
System.Data.DataRow startRow,
int level) {
var subItems = from item in source
where item.Field<object>(actualChildColumn) != null
&& item.Field<object>(actualChildColumn) != System.DBNull.Value
&& item.Field<object>(actualChildColumn).Equals
(startRow.Field<object>(actualParentColumn))
select new TreeItem() {
Level = level,
Row = item,
ParentRow = startRow
};
foreach (TreeItem subItem in subItems) {
resultSet.Add(subItem);
DataTableTree.FillTree(source, resultSet, actualParentColumn, actualChildColumn,
subItem.Row, level + 1);
}
return resultSet;
}
Private Function FillTree(source As EnumerableRowCollection(Of DataRow), _
resultSet As HashSet(Of DataTableTree.TreeItem), _
actualParentColumn As String, _
actualChildColumn As String, _
startRow As System.Data.DataRow, _
level As Integer) As HashSet(Of DataTableTree.TreeItem)
Dim subItems = From item In source
Where Not (item.Field(Of Object)(actualChildColumn) Is Nothing) _
AndAlso Not (item.Field(Of Object)(actualChildColumn) Is System.DBNull.Value) _
AndAlso item.Field(Of Object)_
(actualChildColumn).Equals(startRow.Field(Of Object)(actualParentColumn))
Select New DataTableTree.TreeItem() With {
.Level = level,
.Row = item,
.ParentRow = startRow
}
For Each subItem As DataTableTree.TreeItem In subItems
resultSet.Add(subItem)
DataTableTreeModule.FillTree(source, resultSet, actualParentColumn, actualChildColumn, _
subItem.Row, level + 1)
Next subItem
Return resultSet
End Function
This method searches for immediate children for a given DataRow
and adds them into the HashSet
. For each child found, the FillTree
method is called recursively in order to add the children of the DataRow
at hand. On each call, the level
is incremented.
Usage Example
Now let’s have a few test runs. Imagine the following organization:
In order to define the structure for the organization, the following data table is created:
hierarchicalTable.Columns.Add("Id", typeof(int));
hierarchicalTable.Columns.Add("Name", typeof(string));
hierarchicalTable.Columns.Add("Title", typeof(string));
hierarchicalTable.Columns.Add("ManagerId", typeof(int));
hierarchicalTable.Constraints.Add(new System.Data.UniqueConstraint(
hierarchicalTable.Columns["Id"], true));
hierarchicalTable.Constraints.Add(new System.Data.ForeignKeyConstraint(
hierarchicalTable.Columns["Id"],
hierarchicalTable.Columns["ManagerId"]));
hierarchicalTable.Columns.Add("Id", System.Type.GetType("System.Int32"))
hierarchicalTable.Columns.Add("Name", System.Type.GetType("System.String"))
hierarchicalTable.Columns.Add("Title", System.Type.GetType("System.String"))
hierarchicalTable.Columns.Add("ManagerId", System.Type.GetType("System.Int32"))
hierarchicalTable.Constraints.Add(New UniqueConstraint( _
hierarchicalTable.Columns("Id"), True))
hierarchicalTable.Constraints.Add(New ForeignKeyConstraint( _
hierarchicalTable.Columns("Id"), _
hierarchicalTable.Columns("ManagerId")))
Each row has an Id
field acting as a primary key and a ManagerId
field which defines the manager for the person. The constraints wouldn't be necessary to use but they ensure that the parent key is present and that it is unique.
Our test data is filled like this:
hierarchicalTable.Rows.Add(1, "Edgar Allan Poe", "CEO", null);
hierarchicalTable.Rows.Add(2, "Blondie (Man with No Name)", "VP Sales", 1);
hierarchicalTable.Rows.Add(3, "Mary Poppins", "Sales Director", 2);
hierarchicalTable.Rows.Add(4, "Frodo Baggins", "Regional Sales Manager", 3);
hierarchicalTable.Rows.Add(5, "Bilbo Baggins", "Regional Sales Manager", 3);
hierarchicalTable.Rows.Add(6, "Smeagol", "Collections", 3);
hierarchicalTable.Rows.Add(7, "Euphegenia Doubtfire", "Sales Assistant", 2);
hierarchicalTable.Rows.Add(8, "Hermann Hesse", "VP R&D", 1);
hierarchicalTable.Rows.Add(9, "Tom Sawyer", "Researcher", 8);
hierarchicalTable.Rows.Add(10, "Isaac Asimov", "Futurist", 8);
hierarchicalTable.Rows.Add(11, "Mini-Me", "R&D Assistant", 8);
hierarchicalTable.Rows.Add(1, "Edgar Allan Poe", "CEO", Nothing)
hierarchicalTable.Rows.Add(2, "Blondie (Man with No Name)", "VP Sales", 1)
hierarchicalTable.Rows.Add(3, "Mary Poppins", "Sales Director", 2)
hierarchicalTable.Rows.Add(4, "Frodo Baggins", "Regional Sales Manager", 3)
hierarchicalTable.Rows.Add(5, "Bilbo Baggins", "Regional Sales Manager", 3)
hierarchicalTable.Rows.Add(6, "Smeagol", "Collections", 3)
hierarchicalTable.Rows.Add(7, "Euphegenia Doubtfire", "Sales Assistant", 2)
hierarchicalTable.Rows.Add(8, "Hermann Hesse", "VP R&D", 1)
hierarchicalTable.Rows.Add(9, "Tom Sawyer", "Researcher", 8)
hierarchicalTable.Rows.Add(10, "Isaac Asimov", "Futurist", 8)
hierarchicalTable.Rows.Add(11, "Mini-Me", "R&D Assistant", 8)
Full Tree
Now in order to see the whole tree, let’s first run a query from the top without any restrictions.
startRow = hierarchicalTable.AsEnumerable().Where
(x => x.Field<int>("Id") == 1).FirstOrDefault();
var query1 = from person in hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
select person;
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("Tree for {0}", startRow["Name"]));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query1) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2}",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"]));
}
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of Integer)("Id") = 1).FirstOrDefault()
Dim query1 = From person In hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
Select person
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("Tree for {0}", startRow("Name")))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query1
System.Console.WriteLine(String.Format(" Level {0}: {1}{2}",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name")))
Next item
First of all, the starting row is defined. I've selected the top-most row based on the id, 1
. After that, the actual LINQ query is defined. In order to illustrate the results, each row found is printed to the console with small amount of formatting so the result would look like this:
=============================================
Tree for Edgar Allan Poe
=============================================
Level 0: Edgar Allan Poe
Level 1: Blondie (Man with No Name)
Level 2: Mary Poppins
Level 3: Frodo Baggins
Level 3: Bilbo Baggins
Level 3: Smeagol
Level 2: Euphegenia Doubtfire
Level 1: Hermann Hesse
Level 2: Tom Sawyer
Level 2: Isaac Asimov
Level 2: Mini-Me
Partial Tree
The next test is to query the tree partially. If we want to select all rows starting from Blondie, we simply need to define another starting row based on the Id
(2
). So the code would look like:
startRow = hierarchicalTable.AsEnumerable().Where(x => x.Field<int>("Id") == 2).FirstOrDefault();
var query2 = from person in hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
select person;
System.Console.WriteLine();
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("Tree for {0}", startRow["Name"]));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query2) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2}",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"]));
}
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of Integer)("Id") = 2).FirstOrDefault()
Dim query2 = From person In hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
Select person
System.Console.WriteLine()
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("Tree for {0}", startRow("Name")))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query2
System.Console.WriteLine(String.Format(" Level {0}: {1}{2}",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name")))
Next item
And the result would be:
=============================================
Tree for Blondie (Man with No Name)
=============================================
Level 0: Blondie (Man with No Name)
Level 1: Mary Poppins
Level 2: Frodo Baggins
Level 2: Bilbo Baggins
Level 2: Smeagol
Level 1: Euphegenia Doubtfire
Note that the levels are now different since the 0
level is always the row where the query is started from.
Bottom-up Query
So far, we've made only top-down queries. The AsTree
had a parameter which could be used to define the direction
for the traversal. So this time, let's have a look at the relations starting from Smeagol. The code would look like:
startRow = hierarchicalTable.AsEnumerable().Where
(x => x.Field<string>("Name") == "Smeagol").FirstOrDefault();
var query3 = from person in hierarchicalTable.AsEnumerable().AsTree
("Id", "ManagerId", startRow, DataTableTree.TraverseDirection.FromChildToParent)
select person;
System.Console.WriteLine();
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("Reverse tree for {0}", startRow["Name"]));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query3) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2}",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"]));
}
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of String)("Name") = "Smeagol").FirstOrDefault()
Dim query3 = From person In hierarchicalTable.AsEnumerable().AsTree_
("Id", "ManagerId", startRow, DataTableTree.TraverseDirection.FromChildToParent)
Select person
System.Console.WriteLine()
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("Reverse tree for {0}", startRow("Name")))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query3
System.Console.WriteLine(String.Format(" Level {0}: {1}{2}",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name")))
Next item
Now the result is:
=============================================
Reverse tree for Smeagol
=============================================
Level 0: Smeagol
Level 1: Mary Poppins
Level 2: Blondie (Man with No Name)
Level 3: Edgar Allan Poe
Restrict the Results in the Query
The last test is to restrict the results in the LINQ query, just to make sure everything works as expected. For example, if we want to list all the assistants and their managers, the code could look like:
startRow = hierarchicalTable.AsEnumerable().Where
(x => x.Field<int>("Id") == 1).FirstOrDefault();
var query4 = from person in hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
where person.Row.Field<string>("Title").Contains("Assistant")
select person;
System.Console.WriteLine();
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("All assistants and their managers"));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query4) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2} (manager: {3})",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"],
item.ParentRow["Name"]));
}
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of Integer)("Id") = 1).FirstOrDefault()
Dim query4 = From person In hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
Where person.Row.Field(Of String)("Title").Contains("Assistant")
Select person
System.Console.WriteLine()
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("All assistants and their managers"))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query4
System.Console.WriteLine(String.Format(" Level {0}: {1}{2} (manager: {3})",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name"),
item.ParentRow("Name")))
Next item
And now the result would be:
=============================================
All assistants and their managers
=============================================
Level 2: Euphegenia Doubtfire (manager: Blondie (Man with No Name))
Level 2: Mini-Me (manager: Hermann Hesse)
Conclusions
Querying a hierarchical data table is actually very simple. However, the code provided doesn't cover all possible scenarios, so feel free to modify it to better suit your needs.
History
- 5th October, 2015: Created