(untagged)

Generic Tree Container in C# 2.0

Max Hajek (aka AzazelDev)

0.00/5 (No votes)

11 Aug 2004

A small library implementing a generic tree container showing off generics and iterators.

Introduction

With C# 2.0, a powerful language feature is being introduced which all C++ programmers have been missing since the beginnings of this altogether so cleanly and beautifully designed language: generics. This article uses this new feature to implement a generic tree class. Tree structures have more than just one natural way of being traversed, this is also a good motivation to demonstrate the use of C# 2.0's new iterator feature. If you wonder how you can build the code or how you can start trying out the new language features, go look at Microsoft's site where you can download a beta of their new Express line of developers products.

Background

For a general introduction into the topic of generics and/or iterators, please take a look at the specification of C# 2.0 by Microsoft which does a good job on explaining them in detail.

Usage

When using the code, you will be dealing with two classes: Node and NodeCollection. Before going into the details of the implementation, it's probably a good idea to scan the demo project which comes with the source archive. The demo project is a console application which takes a single parameter. This parameter should be a path pointing to a directory. The demo first loads the file hierarchy rooted inside that path into an internally kept tree structure using the Node class, and then iterates over this tree structure once depth-first, then breadth-first. In the demo's Run() method, you see how to specialize the Node class for a given type, and construct an instance:

void Run(string rootdir)
{
    Node<string> root = new Node<string>(rootdir);

The code above creates an instance of the Node class specialized for containing data of type string and initializes the instance with a value. The Node class contains a property Data which allows read-only access to the element type for which it is specialized. Exactly this property is what you initialize in the constructor code. If you want to access the children of a node, you use the Children property which returns an instance of NodeCollection. Each node has a Parent property, too, which is read-only. If you add a node as the child of another, the first node's Parent property will reflect this change. Newly constructed nodes return null as their Parent, meaning they are located in the root of the hierarchy. The NodeCollection, on the other hand, has a read-only property Owner returning the Node whose children it contains. NodeCollection is not meant to be used as a general-purpose container for Nodes, thus its constructor is not publicly accessible. Node contains other useful properties and methods: the Root property gets the root of a node (wouldn't you have guessed ;-)), IsAncestorOf() takes another node and returns whether the instance node is an ancestor of the passed node, IsDescendantOf() checks for exactly the opposite condition, and DoesShareHierarchyWith() takes another node and returns whether the instance node and the passed node are equal or have an ancestor-descendant relationship (true) or not (false). GetDepthFirstEnumerator() and GetBreadthFirstEnumerator() return iterators performing a depth-first or breadth-first traversal. Note that the iterators returned by these methods iterate over the Data values of the nodes, not the nodes themselves, meaning that .Current will return the value of Data of the currently visited node.

Implementation

The implementation of this small library makes use of two new features of C# 2.0: generics and iterators. Generics are used to allow the nodes to function as a type-safe container. Without them, the Data property would have to be of type object, requiring clients to cast to the correct type and allowing to instantiate Node with objects of different types. Iterators, on the other hand, are used for implementing depth-first and breadth-first traversal in a compellingly intuitive way. Let's take a look at their implementation:

        public IEnumerator<Element> GetDepthFirstEnumerator()
        {
            yield return mData;
            foreach (Node<Element> kid in mChildren)
            {
                IEnumerator<Element> kidenumerator= kid.GetDepthFirstEnumerator();
                while (kidenumerator.MoveNext())
                    yield return kidenumerator.Current;
            }
        }

IEnumerator<> is defined in System.Collections.Generic and defines a typesafe iterator, i.e., an iterator whose Current property is strongly typed.

Now, our depth-first traversal first returns the data of the current node itself, then iterates over the children nodes, gets their enumerators, and returns the values visited. For example, if you build up a structure like:

            Node<string> root= new Node<string>("Gramps");
            Node<string> son= new Node<string>("Son");
            Node<string> daughter= new Node<string>("Daughter");
            root.Children.Add(son);
            root.Children.Add(daugther);
            son.Children.Add(new Node<string>("Grandson"));
            son.Children.Add(new Node<string>("Granddaughter"));

a call like:

            foreach (string s in root.GetDepthFirstEnumerator())
                Console.WriteLine(s);

will result in:

            Gramps
            Son
            Grandson
            Granddaughter
            Daugther

First, root.GetDepthFirstEnumerator() yields root's data, i.e., "Gramps"; then it starts iterating over its children, the first one being son. Now, son.GetDepthFirstEnumerator() gets called and yields its data "Son" before starting to iterate over its own children. Both children yield their data "Grandson" and "Granddaughter", but no more because they don't have children themselves. Now, son.GetDepthFirstEnumerator() is done, and root.GetDepthFirstEnumerator() goes on to daughter, which, having no children of its own, yields only its own data "Daughter".

Breadth-first traversal is a little more complicated, but still C# 2.0's iterator facility allows an incredibly easy implementation:

        public IEnumerator<Element> GetBreadthFirstEnumerator()
        {
            Queue<Node<Element>> todo = new Queue<Node<Element>>();
            todo.Enqueue(this);
            while (0 < todo.Count)
            {
                Node<Element> n = todo.Dequeue();
                foreach (Node<Element> kid in n.mChildren)
                    todo.Enqueue(kid);
                yield return n.mData;
            }
        }

Assuming the same node structure as for the depth-first example, the code piece:

            foreach (string s in root.GetBreadthFirstEnumerator())
                Console.WriteLine(s);

produces this output:

              Gramps
              Son
              Daugther
              Grandson
              Granddaughter

In this case, recursion is unrolled into a loop. First, root.GetBreadthFirstEnumerator() creates a queue and puts root inside. Then, getting into the while loop, it unqueues root. Now, it iterates over root's children son and daughter and puts them into the queue. Then, it yields root's data "Gramps". Next time inside the loop, it pops son from the queue, pushes son's children into the queue, and yields son's data "Son". Next time, it pops daughter from the queue which has no children, so it yields daughter's data "Daugther" and proceeds. Now, it pops the two children of son, both of which don't have children of their own, thus do not add anything to the queue. Their data "Grandson" and "Granddaugther" gets yielded, and next time checking the loop-condition, the queue is empty and traversal thus finished.

Conclusion and To-Do

This little class gets you a generic tree container, if you need a quick and simple solution. It shows off the possibilities the new language features of C# 2.0 facilitate. In order to create a really useful, general purpose tree library, though, the code would have to be more adaptable to specific needs of the clients. Here's a short list of ideas - and who knows, maybe my next article will have some or even all of them implemented:

At the moment, Node is implemented for trees with an arbitrary number of children. But very often, you know that each node will have a fixed number of children (binary trees, quad trees etc.). For such scenarios, the presented implementation is clearly inefficient as it makes use internally of a framework container (System.Containers.Generic.List) where it should be using an array. So, it would be nice if nodes were parameterizable with a strategy for keeping children. Ideally, there would be two predefined strategies in the library, one using the list container, another one for arrays of static size, and clients could implement an interface for other cases.
Considering that quite often there is the need to visualize a tree structure, it would make sense to write an adaptor for filling a TreeView from a Node structure.
As for now, both traversal algorithms do not supply clients with any information about where exactly they are in the tree. Now, in game applications, for example, it's usually important to know how deep a depth-first traversal has gone currently. Besides, sometimes they want to parameterize how deep traversal will proceed, or they want to add nodes as they get deeper into the tree.

History

2004-08-10: first release.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here