The Basics of Linked Lists and Binary Trees

Jaken Veina

Rate me:

4.77/5 (19 votes)

7 Aug 200510 min read

129.1K

2.6K

The basics of creating and using linked lists, and their cousin binary trees, from scratch

What Is a Linked List?

A linked list can be thought of like an array, for it does basically the same thing. An array is a convenient way to store and sort many instances of similar data, or in other words, many variables of the same type. A basic linked list is simply a different approach to this idea.

What's the Difference?

Arrays work in a way that each individual element is stored right after the previous, in succession, and the only thing needed to access them is a pointer to the first. In this way, you can access the first element by simply dereferencing that pointer.

C++

char array[5];

*array; //Same as array[0]

Then, since each element is stored in succession, if you want to find the second element, simply look in the next piece of memory.

*(array + 1); //Same as array[1]

And so on. This approach uses very minimal memory, a must for things such as game programming and robotics programming, which often times do not have the luxury of a vast amount of memory, that we computer programmers have.

So, as you might have guessed now, linked lists use up much more, the most complex form of a linked list, a tree, uses about four times as much memory as an array, depending on the type of data that is being stored. If each variable is one byte, it would actually be thirteen times as much, but if you're storing strings, or something similar that can take up very large amounts of memory, the increase in memory usage would not be that significant.

So far, I've only talked about space issues, and if that's all you're looking at, there's no question that arrays are the way to go. But, space is not the only issue. There is also speed. One can think of space and speed as darkness and light. The more light there is, the less darkness there is, and vice versa. Space and speed often relate in the same way.

So, thus, in terms of speed, arrays are the lacking and linked lists win out. Or at least, that may be what you're thinking. But, that statement is not entirely true. There are some instances where it is, but there are also some where it is not. You must understand exactly where the two methods differ in order to know which will be faster in a given situation.. I've been going on for a while now, and you still probably have no clue how to make a linked list or use one, so I'll come back to this subject later.

The Basics - A Singly Linked List

A friend once described a basic linked list as a train and each element being stored as its cars. I thought it was quite a good representation, so I shall do the same.

In a passenger train, each car holds data, the people riding in it, and a link to the next car, a door. An item, or node as it is called, of a singly linked list is the same. It holds data and a link to the next car, which in programming would be a pointer.

Now, let's look at the ticket taker. He starts at the front of the train and moves, one car at a time, to the back, checking each occupant for his or her ticket. In a linked list, the equivalent of the ticket taker would be a pointer. The pointer would start out pointing to the first node of a linked list. You, as a programmer, would dereference that pointer and look at the data in that node, perhaps change it or do something with it. Then, you would access the pointer in that node which points to the next node, and set that value into your ticket taker pointer. Now, your ticket taker pointer points to the second node, and you can again look at the data, and when you're done, you can set the ticket taker to point to the next node. The last item in the list would point to NULL as it's the next node, so you'd know you're done.

Now, this seems a bit complex, I know, but it's all just talking. So, let's put it into some actual code. I'll be using a C++ class for both the nodes and the linked list as a whole. For now, the data in each node will just be a simple int.

C++

//Basic node class. 
class LLNode
 {
  public:
    LLNode() //Default Constructor
     {
      p_data = 0;    //No data.
      p_next = NULL; //New node always points to NULL by default.
     }
    LLNode(unsigned int data) //Constructor
     {
      p_data = data; //Set the data
      p_next = NULL; //New node always points to NULL by default.
     }
    //Retrieve the data of the node. Constant function.
    unsigned int data(void) const 
     {
      return p_data;
     }
    //Retrieve the next point of the node. Constant function.
    LLNode* next(void) const 
     {
      return p_next;
     }
    //Set the data of the node.
    void setdata(unsigned int data) 
     {
      p_data = data;
     }
    //Set the next pointer of the node.
    void setnext(LLNode* next) 
     {
      p_next = next;
     }
  private:
    //Data stored by the node. protected.
    unsigned int p_data; 
    //Pointer to the next node. protected.
    LLNode* p_next;      
 };

C++

//Linked list class. Handles allocation, 
//deallocation, and sorting of nodes.
class LList
 {
  public:
    LList() //Constructor
     {
      p_first = NULL; //List contains no nodes by default.
     }
    ~LList() //Destructor
     {
      //Current node being pointed to.
      LLNode* current = p_first; 
      //Variable for storing the node to be deleted.
      LLNode* deleting;          

      //While current is not NULL, there are 
      //still more items in the list.
      while(current != NULL) 
       {
        //Store the current node to be deleted.
        deleting = current;        
        //Move to the next node.
        current = current->next(); 
        //Delete the stored node.
        delete deleting;           
       }
     }
    //Retrieves the first node in the list. 
    //Constant function.
    LLNode* first(void) const 
     {
      return p_first;
     }
    //Add an item to the end of the list.
    void additem(unsigned int data) 
     {
      //Make sure there are nodes in the list.
      if(!p_first)                  
        //If there aren't make the first node.
        p_first = new LLNode(data); 
      else             //If there are items...
       {
        LLNode* current = p_first;        //Stores the current node.
        while(current->next() != NULL)    //While there is a next node.
          current = current->next();      //Move to the next node.
        LLNode* prev = current;           //Store the current node.
        current = new LLNode(data);       //Make a new node.
        prev->setnext(current);           //Set the stored node to 
                                          //point to the new one.
       }
     }
    //Delete an item from the list, based on its data.
    bool deleteitem(unsigned int data) 
     {
      if(p_first == NULL)                 //Make sure there are items in the list.
        return 0;                         //If there aren't end the function now.

      LLNode* current = p_first; //Start with the first node.

      //If the first node is the one we want...
      if(current->data() == data) 
       {
        p_first = current->next();        //Set p_first to NULL.
        delete current;                   //Delete the node.
        return 1;                         //Success.
       }
      
      //While there is a next node and it is not the one we want.
      while(current->next() != NULL && 
            current->next()->data() != data)
        current = current->next();        //Move to the next node.

      if(current->next() != NULL)         //If the node was found.
       {
        //Store the current node.
        LLNode* prev = current;         
        //Move to the node to be deleted.
        current = current->next();      
        //Remove links to the node.
        prev->setnext(current->next()); 
        //Delete the node.
        delete current;                 
        //Success.
        return 1;                       
       }
      return 0; //Failure.
     }
    //Search the list for a node with specified data.
    LLNode* search(unsigned int data) 
     {
      //Set the current node to the first.
      LLNode* current = p_first; 
      //While current is not NULL, there are still more nodes.
      while(current != NULL)     
       {
        //If we have found the node, with the correct data...
        if(current->data() == data) 
          return current;     //return it.
        //Otherwise, move to the next node and continue.
        current = current->next();  
       }
      return NULL; //If we never found the node, return NULL.
     }
    //Insert a new node after a node with specified data.
    bool insertitem(unsigned int prevdata, unsigned int data) 
     {
      //Find the node with the specified data.
      LLNode* prev = search(prevdata); 
      //If the node was not found...
      if(prev == NULL)                 
        return 0;    //We did not succeed.

      //Store the next node.
      LLNode* next = prev->next();        
      //Make the new node.
      LLNode* current = new LLNode(data); 
      //Set the previous node to point to the new one.
      prev->setnext(current);             
      //Set the new node to point to the next one.
      current->setnext(next);             

      return 1; //We did succeed.
     }
    //Insert a new node as the first node.
    void insertfirst(unsigned int data) 
     {
      //Create the new node.
      LLNode* current = new LLNode(data); 
      //Make the current node point to the old first node.
      current->setnext(p_first);          
      //Set p_first to the current node.
      p_first = current;                  
     }
  private:
    LLNode* p_first; //First node in the list. 
 };

There is a lot to take in, yes, but, with the comments, you should be able to understand it, pretty well. In your own linked list implementations, you can add in other functions you find necessary, or remove those that you feel are not. It's up to you.

Doubly Linked lists

Now, one thing you might have found about the previous, a singly linked list, is that when you are traversing it, you may only move in one direction. If your purposes serve, you may take this list and upgrade it to a doubly linked list. There are just a few things you must add to the node class to do this. Mostly, this is just a pointer to the previous node, in addition to the pointer to the next node, but you must make sure that you give this new pointer the correct address value, therefore you must edit both the node's class and the list's class. I've laid out the only node's class below, simply for space issues. The full code for both classes can be found in the source, provided with this article.

C++

//Basic Doubly Linked List node class. 
class DLLNode
 {
  public:
    DLLNode() //Default Constructor
     {
      p_data = 0;    //No data.
      //New node always has a NULL next node.
      p_next = NULL; 
      //New node always has a NULL previous node.
      p_prev = NULL; 
     }
    DLLNode(unsigned int data) //Constructor
     {
      p_data = data;           //No data.
      //New node always has a NULL next node.
      p_next = NULL; 
      //New node always has a NULL previous node.
      p_prev = NULL; 
     }
    //Retrieve the data of the node. Constant function.
    unsigned int data(void) const 
     {
      return p_data;
     }
    //Retrieve the previous pointer of the node. 
    //Constant function.
    DLLNode* prev(void) const 
     {
      return p_prev;
     }
    //Retrieve the previous pointer of the node. 
    //Constant function.
    DLLNode* next(void) const 
     {
      return p_next;
     }
    //Set the data of the node. 
    void setdata(unsigned int data) 
     {
      p_data = data;
     }
    //Set the previous pointer of the node.
    void setprev(DLLNode* prev) 
     {
      p_prev = prev;
     }
    //Set the next pointer of the node.
    void setnext(DLLNode* next) 
     {
      p_next = next;
     }
  private:
    //Data stored by the node. protected.
    unsigned int p_data; 
    //Pointer to the next node. protected.
    DLLNode* p_prev;     
    //Pointer to the previous node. protected.
    DLLNode* p_next;     
 };

Arrays and Linked Lists

So, back to the topic of the differences between the two. I've already discussed the space issues, in depth, so now I will go into speed.

Arrays are actually very fast, both from a programming aspect and from an executing aspect, but this is only in certain areas. Accessing and changing the values of array elements is extremely faster than with linked lists, since all you need to do is dereference a memory address. And finding a specific item in the list is easy as well. You can go directly to it through simple pointer arithmetic, unlike linked lists which must cycle through each node until they get to the desired one. Arrays fall very short, though, when it comes to sorting, re-sorting, inserting, and deleting elements. To sort or re-sort, you must move each individual element in memory, and there is always the aspect that they must be consecutively stored, that you must keep in mind. This applies to inserting and deleting, for in either one, you must move each individual element that follows the inserted or deleted one. This can be very time consuming, in both the coding stage and in execution.

Linked lists require much more code to perform simple operations, even though the code is simple to understand. But it is definitely more work than simply saying array[5], or whatever. However, where arrays are lacking, linked lists soar. Inserting and deleting an item in a list requires two simple changes to the rest of the list, unlike arrays that must change the entire remaining portion. The same goes for moving an item. It requires two actions to remove it, and two actions to put it back. Arrays, must do the 'change the entire list' thing twice, just to move one item.

For most purposes, arrays are quite satisfactory. Linked lists, like binary trees, which you will learn about in a moment, are mostly used when sorting the items is necessary.

Binary Trees

And now for the biggie. Binary trees are one of the most efficient data structures for sorting data. I'll lay out a binary tree for you, visually:

      []
    /    \
  []      []
 /  \    /  \
[]  []  []  []

Each set of brackets represents a node, and as you can tell from the middle row of nodes, each one is linked to two different nodes, a left node and a right node. As I said, the main use of binary trees is for sorting. Let's say I have a linked list with a single node, storing the value of five.

[5]

Now, I have a new number, entered by the user, that I need to sort. The number is three. Since that number is less than the value of my first node, I move to the left. Of course, there is no left node, so I simply add in a new one, containing the number three.

     [5]
   /
[3]

If I get another number that is greater than the first node, say eight, I would go to the right. Again, there is no right node, so I go ahead and add in a new one.

     [5]
   /     \
[3]       [8]

Next up I have a six. It is greater than five, so I move to the right. But this time, there is a node to the right, so I do the same checks on it. Six is less than eight, so I move left. Since there is no node there, I add it. (Get the pattern yet?)

     [5]
   /     \
[3]       [8]
         /
       [6]

From here, I'll just skip the text. If you haven't picked it up yet, you will.

     [5]
   /     \
[3]       [8]
   \     /
   [4] [6]

        [5]
      /     \
   [3]       [8]
  /   \     /
[2]   [4] [6]

        [5]
      /     \
   [3]       [8]
  /   \     /
[2]   [4] [6]
            \
            [7]

        [5]
      /     \
   [3]       [8]
  /   \     /   \
[2]   [4] [6]   [9]
            \
            [7]

          [5]
        /     \
     [3]       [8]
    /   \     /   \
  [2]   [4] [6]   [9]
  /           \
[1]           [7]

And now, we have a complete binary tree, that sorts the numbers one through nine. As you might have noticed, the actual structure of the tree depends upon how the numbers are entered. You could also have had...

     [3]
   /     \
[1]       [7]
   \     /   \
   [2] [5]   [8]
       / \     \
     [4] [6]   [9]

or...

      [6]
    /     \
 [1]       [7]
    \         \
    [4]       [9]
    / \       /
  [3] [5]   [8]
  /
[2]

or, you might have even had a balanced tree...

        [5]
      /     \
   [2]       [8]
  /   \     /   \
[1]   [4] [6]   [9]
      /     \
    [3]     [7]

Although, technically, a "perfectly" balanced tree would also have a zero and a ten on the bottom corners. A balanced tree is just the term used for a tree in which each and every path is of the same length.

As I said, binary trees are a very efficient and fast way of sorting. However, this depends upon the structure of the tree. If you have a very unbalanced tree, it will take longer to sort something on the larger side, than on the smaller side. It's important to know this when using binary trees. And, as with linked lists, you can, if you desire, make yourself a doubly linked binary tree, where each node also points to the previous node. In fact, it would probably be preferable, for a link to the previous node makes outputting the sorted data, in a sorted order, much easier. The example I have provided incorporates this. But, it all depends on how you plan to use it.

When it comes to coding, there are only two things that you need to know how to do. Traversing the list and searching it. Both involve the idea of recursion, in which a function calls itself. I'll use the search function as an example, but again, the full code is available for download.

C++

//Private, recursive, function for searching.
BTNode* p_search(unsigned int item, BTNode* node) 
 {
  if(node != NULL) //If the node is not NULL...
   {
    //If we have found the node...
    if(item == node->data()) 
      return node;    //Return it.
    //If the node's data is greater than the search item...
    if(item < node->data())  
      //Search the left node.
      return p_search(item, node->left());  
    //If the node's data is less than the search item...
    else                     
      //Search the right node.
      return p_search(item, node->right()); 
   }
  else
    //If the node is NULL, return NULL.
    return NULL; 
 }

With the comments, it should be self-explanatory. The code for traversing the list is slightly more complex, since you must be sure to traverse each and every node, rather than just those that exist down the path that you need to travel to find the node. To add a node, you would really just use the search function, but when a node is null, rather than failure, it means you have found the point where the node belongs, and you then insert it. Again, all the code is zipped for download.

History

August 7^th, 2005 - Version 1.0

Thanks

Maximilien of CodeProject.com - pointed out another plus of arrays
AgeKay of CodeProject.com - pointed out a small, but dangerous typo error in the code

License

This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board below. A list of licenses authors might use can be found here.

Written By

Jaken Veina

Web Developer

United States

This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.