Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#4.0

Balanced Binary Search Tree (BST) (Search, Delete, InOrder, PreOrder, PostOrder,DepthFirst, BreadthFirst, BalanceTree)

5.00/5 (14 votes)
9 Jun 2012CPOL5 min read 1   2.5K  
Balanced Binary Search Tree (BST) (Search, Delete, PrintInOrder, PrintPreOrder, PrintPostOrder,DepthFirst, BreadthFirst, BalanceTree)

Introduction

From Wikipedia:

In computer science, a binary search tree (BST) is a node-based binary tree data structure which has the following properties:[1]

  • The left subtree of a node contains only nodes with keys less than the node's key.
  • The right subtree of a node contains only nodes with keys greater than the node's key.
  • Both the left and right subtrees must also be binary search trees.

From the above properties, it naturally follows that:

  • Each node (item in the tree) has a distinct key:

I have always been fascinated with the Binary Tree data structure. It’s nice to have a storage structure that supports O (log n) search time. I've decided to create my own binary tree data structure with a few added features like the ability to rebalance the tree whenever the user desires.

Background

First let me explain what O (log n) means when it comes to the Binary Search Trees.

Let's start out with O (n):

Let’s use an array for this example. Let’s suppose I have an array of 10 items. If I wanted to search for a specific element, there is a chance that I may have to search all the elements in the array. This situation will occur if the element that I’m looking for is located in the last cell of the array. Now let n be equal to the length of the array (here n=10). That means I will have to search up to 10 items. Therefore the search time for an array is O (n), where n is the length of the array.

Now O (Log n):

Now let’s say that I use a binary tree to store the 10 items instead of an array. Because the binary tree will store the elements in sorted order if I’m looking for a specific item, I don't have to search the entire binary tree. I can just check if the current element is greater or less than the element that I’m looking for. If the current element is greater than the element I’m looking for, I search its left tree. If the current element is less than the element I’m looking for, I search its right tree. The number of comparisons I make is really the depth of the tree itself. So O (log n) is the actual depth of the tree (the number of nodes from the root to the deepest element). This is the number of comparisons I need to make to find an element in the average case.

Why is O (log n) better than O (n)?

Again let’s say that we have 8 items that we place into an array and a binary tree.

In the array, my search time will be O (8) = 8. In the binary tree, my search time will be O (log 8) = 3. We can see that 3 < 8 here. So the binary tree would be a better choice.

Even as the number of elements increase to 64 (n = 64), I can see that O (log n) is faster because O (log 64) = 6 whereas O (64) is 64. It’s important to remember that in the binary tree, I’m only searching up the depth of the tree.

There can also be situations where a binary tree becomes degenerate (Google it). In this case, the average search time for both the array and binary tree will be O (n). My Balance() function will take care of the degenerate tree and restore a O (log n) search time.

Implementation

I've created some simple procedures to visit the nodes in the binary tree in different orders. These functions are InOrder(), PreOrder(), PostOrder(), DepthFirst(), BreadthFirst(). These terms should be familiar to knowledgeable developers. They all used named iterators to return each node. Currently I just print out the value of the node to the console. You can modify it to your liking.

The Balance() Function Explanation

As items get added to the binary tree, the structure can degenerate and cause a O (n) search time in some cases. The Balance() function will rebalance the tree. The method I used to rebalance the tree is common. I've actually seen a few articles on the internet explaining it. I first copy the reference pointers of the elements in the binary tree into a list. I actually copy them InOrder. This way all items in the list are in sorted order from smallest to largest. I then find a pivot index like this:

C#
int middleNode = (int)Math.Ceiling(((double)min + max) / 2); 

From the above line, I get the middle element of the list. This will be my root node in the new balanced tree. I then recursively apply the same technique to the left half of the list and then to the right half. Every time I find the ‘root’ node, I add it to my new balanced tree. At the end of the recursion, I am left with a nicely balanced tree.

Here is the complete code for the binary search tree:

C#
[Serializable]
public class BinaryTree<T> where T : IComparable<T>
{
    private BinaryTreeNode<T> root;
    public int Count;
    
    public void Insert(T value)
    {
        BinaryTreeNode<T> node = new BinaryTreeNode<T>(value);
        
        Insert(node);
    }
    
    public void Insert(BinaryTreeNode<T> node)
    {
        if (node == null)
        {
            throw new ArgumentNullException("node","node cannot be null");
        }
        
        if (root == null)
        {
            root = node;
        }
        else
        {
            Insert(node, ref root);
        }
        
        Count++;
    }
    
    public void Delete(T value, bool rebalanceTree)
    {
        BinaryTreeNode<T> parentNode;
        BinaryTreeNode<T> foundNode = null;
        BinaryTreeNode<T> tree = parentNode = root;
        
        //Search for the node while keeping a reference to the parent
        while (tree != null)
        {
            if (value.CompareTo(tree.Data) == 0)
            {
                foundNode = tree;
                break;
            }
            else if (value.CompareTo(tree.Data) < 0)
            {
                parentNode = tree;
                tree = tree.Left;
            }
            else if (value.CompareTo(tree.Data) > 0)
            {
                parentNode = tree;
                tree = tree.Right;
            }
        }
        
        if (foundNode == null)
        {
            throw new Exception("Node not found in binary tree");
        }
        
        bool leftOrRightNode = false;
        
        if (foundNode != parentNode.Left)
        {
            leftOrRightNode = true;
        }
        
        if (foundNode == parentNode) //oh oh. We're trying to delete the root here.
        {
            if (rebalanceTree)
            {
                //Let's just remove the parent and rebalance the tree 
                //to ensure our tree will be nice and balanced
                //after the removal
                IList<BinaryTreeNode<T><binarytreenode<t>> listOfNodes = 
				new List<BinaryTreeNode<T><binarytreenode<t>>();
                 
                FillListInOrder(root, listOfNodes);
                RemoveChildren(listOfNodes);
                listOfNodes.Remove(parentNode);
                
                root = null;
                int count = Count - 1;
                Count = 0;
                
                BalanceTree(0, count - 1, listOfNodes);
            }
            else
            {
                //Regular way without balancing: just find the left-most node on the right
                //side of the tree and that will become the root
                BinaryTreeNode<T> leftMost = FindLeftMost(parentNode.Right, true);
                
                if (leftMost != null)
                {
                    leftMost.Left = parentNode.Left;
                    leftMost.Right = parentNode.Right;
                    root = leftMost;
                }
            }
        }
        else if (foundNode.Left == null && foundNode.Right == null) //This is a leaf node
        {
            //Just set it to null
            if (leftOrRightNode)
            {
                parentNode.Right = null;
            }
            else
            {
                parentNode.Left = null;
            }
        }
        else if (foundNode.Left != null && 
		foundNode.Right != null) //This is a node with two children
        {
            if (leftOrRightNode)
            {
                parentNode.Right = foundNode.Right;
                parentNode.Right.Left = foundNode.Left;
            }
            else
            {
                parentNode.Left = foundNode.Right;
                parentNode.Left.Left = foundNode.Left;
            }
        }
        
        else if (foundNode.Left != null || 
		foundNode.Right != null) //This is a node with a single child
        {
            if (foundNode.Left != null)
            {
                if (leftOrRightNode)
                {
                    parentNode.Right = foundNode.Left;
                }
                else
                {
                    parentNode.Left = foundNode.Left;
                }
            }
            else
            {
                if (leftOrRightNode)
                {
                    parentNode.Right = foundNode.Right;
                }
                else
                {
                    parentNode.Left = foundNode.Right;
                }
            }
        }
    }
    
    public BinaryTreeNode<T> Search(T value)
    {
        BinaryTreeNode<T> tree = root;
        
        while (root != null)
        {
            if (value.CompareTo(tree.Data) == 0)
            {
                return tree;
            }
            else if (value.CompareTo(tree.Data) < 0)
            {
                tree = tree.Left;
            }
            else if (value.CompareTo(tree.Data) > 0)
            {
                tree = tree.Right;
            }
        }
        
        return null;
    }
    
    public IEnumerable<BinaryTreeNode<T>> InOrder()
    {
        return InOrder(root);
    }
    
    private IEnumerable<BinaryTreeNode<T>> InOrder(BinaryTreeNode<T> node)
    {
        if (node != null)
        {
            foreach(BinaryTreeNode<T> left in InOrder(node.Left))
            {
                yield return left;
            }
            
            yield return node;
            
            foreach (BinaryTreeNode<T> right in InOrder(node.Right))
            {
                yield return right;
            }
        }
    }
    
    public IEnumerable<BinaryTreeNode<T>> PreOrder()
    {
        return PreOrder(root);
    }
    
    public IEnumerable<BinaryTreeNode<T>> PostOrder()
    {
       return  PostOrder(root);
    }
    
    public IEnumerable<BinaryTreeNode<T>> BreadthFirstTraversal()
    {
        Queue<BinaryTreeNode<T>> queue = new Queue<BinaryTreeNode<T>>();
        
        queue.Enqueue(root);
        
        while (queue.Count != 0)
        {
            BinaryTreeNode<T> current = queue.Dequeue();
            
            if (current != null)
            {
                queue.Enqueue(current.Left);
                queue.Enqueue(current.Right);
                
                yield return current;
            }
        }
    }
    
    public IEnumerable<BinaryTreeNode<T>> DepthFirstTraversal()
    {
        Stack<BinaryTreeNode<T>> queue = new Stack<BinaryTreeNode<T>>();
        
        BinaryTreeNode<T> current;
        
        queue.Push(root);
        
        while (queue.Count != 0)
        {
            current = queue.Pop();
            
            if (current != null)
            {
                queue.Push(current.Right);
                queue.Push(current.Left);
                
                yield return current;
            }
        }
    }
    
    public void BalanceTree()
    {
        IList<BinaryTreeNode<T>> listOfNodes = new List<BinaryTreeNode<T>>();
        
        FillListInOrder(root, listOfNodes);
        RemoveChildren(listOfNodes);
        
        root = null;
        int count = Count;
        Count = 0;
        
        BalanceTree(0, count - 1, listOfNodes);
    }
    
    private void Insert(BinaryTreeNode<T> node, ref BinaryTreeNode<T> parent)
    {
        if (parent == null)
        {
            parent = node;
        }
        else
        {
            if (node.Data.CompareTo(parent.Data) < 0)
            {
                Insert(node, ref parent.Left);
            }
            else if (node.Data.CompareTo(parent.Data) > 0)
            {
                Insert(node, ref  parent.Right);
            }
            else if (node.Data.CompareTo(parent.Data) == 0)
            {
                throw new ArgumentException("Duplicate node");
            }
        }
    }
    
    private void BalanceTree(int min, int max, 
	IList<BinaryTreeNode<T>><binarytreenode<t> list)
    {
        if (min <= max)
        {
            int middleNode = (int)Math.Ceiling(((double)min + max) / 2);
            
            Insert(list[middleNode]);
            
            BalanceTree(min, middleNode - 1, list);
            
            BalanceTree(middleNode + 1, max, list);
        }            
    }
    
    private void FillListInOrder(BinaryTreeNode<T> node, 
	ICollection<BinaryTreeNode<T>><binarytreenode<t> list)
    {
        if (node != null)
        {
            FillListInOrder(node.Left, list);
            
            list.Add(node);
            
            FillListInOrder(node.Right, list);
        }
    }
    
    private void RemoveChildren(IEnumerable<BinaryTreeNode<T><binarytreenode<t>> list)
    {
       foreach(BinaryTreeNode<T> node in list)
       {
           node.Left = null;
           node.Right = null;
       }
    }
    
    private IEnumerable<BinaryTreeNode<T>> PreOrder(BinaryTreeNode<T> node)
    {
        if (node != null)
        {
            yield return node;
            
            foreach (BinaryTreeNode<T><t> left in PreOrder(node.Left))
            {
                yield return left;
            }
            
            foreach (BinaryTreeNode<T> right in PreOrder(node.Right))
            {
                yield return right;
            }
        }
    }
    
    private IEnumerable<BinaryTreeNode<T>> PostOrder(BinaryTreeNode<T> node)
    {
        if (node != null)
        {
            foreach (BinaryTreeNode<T> left in PostOrder(node.Left))
            {
                yield return left;
            }
            
            foreach (BinaryTreeNode<T> right  in PostOrder(node.Right))
            {
                yield return right;
            }
            
            yield return node;
        }
    }
    
    private BinaryTreeNode<T> FindLeftMost(BinaryTreeNode<T> node, bool setParentToNull)
    {
        BinaryTreeNode<T> leftMost = null;
        BinaryTreeNode<T> current = leftMost = node;
        BinaryTreeNode<T> parent = null;
        
        while (current != null)
        {
            if (current.Left!=null)
            {
                parent = current;
                leftMost = current.Left;
            }
        
            current = current.Left;
        }
        
        if (parent!=null && setParentToNull)
        {
            parent.Left = null;
        }
        
        return leftMost;
    }    
}

[Serializable]
public class BinaryTreeNode<T> where T : IComparable<T>
{
    public BinaryTreeNode(T value)
    {
        Data = value;
    }
    
    public T Data
    {
        get;
        set;
    }
    
    public BinaryTreeNode<T> Left;
    
    public BinaryTreeNode<T> Right;
}

Comments

The way I implemented the Delete() method when it comes to deleting the root node is interesting. Not only do I remove the root but I also rebalance the tree. I was looking at some examples where when I remove the root and not rebalance the tree, we would not get a good binary distribution and optimal tree. That being said, the binary tree (not the elements) is re-created in this case and does cause a small performance hit to re-insert them into the binary tree. The Delete() method can also be called to delete the root node and not rebalance the tree. There is virtually no performance hit here other than the O (log n) to find the left-most node on the right sub-tree.

Very Important Points

Remember that this binary tree uses recursion to insert items and retrieve items (InOrder, PreOrder, PostOrder).  That means that if there are many levels in the tree and therefore many nodes your execution stack may become overwhelmed.  The thread execution stack has a limit of roughly 1 MB so many recursive calls will result in a StackOverflowException which cannot be caught and handled.  In cases where you plan to store a huge amount of items in the tree please convert the recursive method to iterative ones.  For example I converted the recursive insert to an interative version below:

C#
private void Insert(BinaryTreeNode<t> node, BinaryTreeNode<t> parent)
{
     BinaryTreeNode<t> tree = parent;
     bool leftChild = false;

     while (tree != null)
     {
       if (node.Data.CompareTo(tree.Data) < 0)
       {
          parent = tree;
          tree = tree.Left;
          leftChild = true;
       }
       else if (node.Data.CompareTo(tree.Data) > 0)
       {
          parent = tree;
          tree = tree.Right;
          leftChild = false;
       }
       else
       {
           throw new ArgumentException("Duplicate node");
       }
      }

      if (leftChild)
      {
         parent.Left = node;
      }
      else
      {
        parent.Right = node;
      }
 }

Please tell me what you think!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)