Introduction
This article outlines a specific technique to get depth-wise ordering from a tree in a database if the tree is being kept track of with a modified Adjacency List model. If you are not already familiar with storing trees in a database, I refer to two common structures and good articles that explain how to use them.
Background
A modified Adjacency List model is a well-known structure to hold hierarchical data or trees in SQL. It is nicely presented in "Trees in SQL databases" by Eugene Lepekhin. I am not going to go into detail about this structure. Let's just identify the basic elements of it. There is the base table, which has the hierarchical relationship, a Parent-Child relationship, and then there is a separate table which holds all of the Parent-Child relationships and the distance between them. All of the relationships here means, it not only has the Children of the Parent, but also the Grand Children and Great Grand Children, etc. The tables can look like this. [I am going to use the same naming as Lepekhin. You can and should refer to his article for the details if you are not already familiar with this.]
Here is a tree to refer to as an example for the rest of the article:
The tables would contain values for the sample tree, like this:
The main selection that you have to do with hierarchical data is to retrieve a subtree. The SQL code would look something like this:
Select n.* From Node n, Tree t Where n.NodeId=t.ChildId and t.ParentId=B
Problem
Now, the problem that I faced was how to order the data retrieved in that subtree.
For a tree, there are two standard ways to order the nodes: breadth-wise and depth-wise. Breadth-wise means that you want to go across before going down. Here, it is simply a matter of ordering on the Level
column in the Tree table. A breadth-wise ordering of our example would return: A, B, E, C, D, F.
Here's the code to do that:
Select n.* From Node n, Tree t Where n.NodeId=t.ChildId and t.ParentId=A Order By t.Level
The real problem comes when you want to do a depth-wise ordering. Depth-wise ordering means that the tree is retrieved going to the bottom of each branch before going across to the next branch. Using our example, a depth-wise ordering would return: A, B, C, D, F, E.
Solution
The solution that I found is to use a Path Enumeration column in the base table and order by it. This column keeps track of the path to the current node from the root. This is also a well-known technique that can be found in "Hierarchical SQL" by Joe Celko. It is itself a way to keep track of hierarchical data in a database, and can be used as an alternative to the modified Adjacency List mode. But, you wouldn't want to use it this way because pattern matching of strings is very slow. The mechanics of Path Enumeration can be handled in triggers, like the Adjacency List model, which I will leave up to the reader who can refer to Celko's article.
Our new Node table now looks like this:
Using our example tree from above, here are the values of Node (depending on some implementation choices you make with regards to the format of the path):
With the Path Enumeration column in place, it can be used to order the contents of the tree depth-wise. The only addition to the selection is to order by NodePath
, and the data set retrieved is depth-wise ordered. Ordering on this string will ensure that each branch at each level is presented in order, based on the value of the ID.
Select n.* From Node n, Tree t Where n.NodeId=t.ChildId and t.ParentId=A
Order By n.NodePath
I am not aware of these two models for hierarchical data being used together like this, but it easily solves this particular problem. This technique is very useful in situations where you don't mind a little extra overhead on insertions and updates due to the triggers in order to get speed on the selection.