Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#3.5

A Starting Point for Searching and Sorting Algorithms

4.81/5 (6 votes)
1 Jun 2011CPOL7 min read 27.8K  
The basics of searching and sorting algorithms via C#.

Preface

This article will take a basic look at searching and sorting algorithms. The advanced developer should overlook this content. As the written example code will show, an algorithm contains a lot of nested functionality. For instance, assume you have a set of data. You perform an operation on each element. The output of that operation is then fed into a for loop, where each element is stepped over to then be fed into another for loop, wherein the output data then is passed on as input to undergo another arithmetic/logical operation. Having said that, the order of the article will be as follows:

  • Linear search algorithm
  • Binary search algorithm
  • Merge sort algorithm
  • Insertion sort algorithm
  • Selection sort algorithm
  • QuickSort algorithm

An algorithm instructor on a YouTube MIT Open Course Ware lecture stated that a programmer's success can easily depend on his or her ability to understand and write algorithms. His focus was on using C and Java. We're going to use .NET managed code. There are many algorithms that search through and/or sort data. Searching data involves determining whether a value is present in the data, and if so, finding the value's location. Searching algorithms normally involve either a linear or a more complicated binary search. The linear search algorithm, as its name implies, searches each element in an array sequentially. If the search key does not match an element in the array, the algorithm tests each element and, when the end of the array is reached, informs the user that the search key (the value sought) is not present.

Searching algorithms

So how would we write a linear algorithm? The code below has a class that contains a private instance variable called values, and a static Random object named rd to fill the array with randomly generated integers. We create a space for the array, and fill the array with random integers ranging in value from 10 -99. The Search method performs a linear search on the elements. The user should enter an integer and have that method locate it (if it is contained in the array) and output its location:

C#
using System;
public class SearchLinear
{
    private Int32 [] values;
    private static Random rd = new Random();

    public SearchLinear(Int32 size)
    {
        values = new Int32[size];
        for (Int32 i = 0; i < size; ++i)
            values[ i ] = rd.Next(10, 100);
    }

    public Int32 Search(Int32 key)
    {
        for ( Int32 j = 0; j < values.Length; ++j)
            if ( values[ j ] == key)
                return j;
        return -1;
    }

    public override string ToString()
    {
        string temp = string.Empty;
    
        foreach (Int32 k in values)
            = k + " ";
        temp += "\n";
        return temp;
    }
}

public class Program 
{
    public static void Main()
    {
        Int32 s;
        Int32 p;
        SearchLinear sl = new SearchLinear(10);
        Console.WriteLine(sl);

        Console.Write( "Enter an integer value (-1 to quit): " );
        s = Convert.ToInt32(Console.ReadLine() );

        while ( s != -1 )
        {
            p = sl.Search(s);
            if ( p != -1 )
                Console.WriteLine("Integer {0} was found in place {1}.\n", s, p);
            else
                Console.WriteLine("Integer {0} was not found.\n", s);
    
            Console.Write("Enter an integer value(-1 to quit): ");
            s = Convert.ToInt32(Console.ReadLine());
        }
    }
}

Capture.JPG

The linear search example was simplified to exemplify the concept of searching the elements sequentially to find the value and specify its location. The Binary search algorithm is a more powerful technique for locating a particular value in a sorted list. The method makes progressively better guesses, and closes in on the location of the sought value by selecting the middle element in the span which, because the list is in sorted order, is the median value, comparing its value to the target value, and determining if it is greater than, less than, or equal to the target value. A guessed index whose value turns out to be too high becomes the new upper bound of the span, and if its value is too low, that index becomes the new lower bound:

C#
using System;

public class BinarySearch
{
    private Int32 []  values;
    private static Random rd = new Random();

    public BinarySearch(Int32 size)
    {
        values = new Int32 [ size ];
    
        for (Int32 i = 0; i < size; ++i)
            values[ i ] = rd.Next(10, 100);
        Array.Sort( values );
    }

    public Int32 Search(Int32 element)
    {
        Int32 bottom = 0;
        Int32 top = values.Length - 1;
        Int32 center = (bottom + top + 1) / 2;
        Int32 area = -1;

        do
        {
            Console.Write(Remainders(bottom, top) );

            for (Int32 i = 0; i < center; ++i)
                Console.WriteLine(" * ");
            if ( element == values[center] )
                area = center;
            else if (element < values[center])
                top = center - 1;
            else
                bottom = center + 1;
                center = (bottom + top + 1) /2;
        } while ( ( bottom <= top) && ( area == -1));
        return area;
    }

    public string Remainders(Int32 bottom, Int32 top)
    {
        string temp = string.Empty;
        for (Int32 i = 0; i < bottom; ++i)
            temp += " ";

        for (Int32 i = bottom; i <= top; ++i)
            temp += values[ i ] + " ";
        temp += "\n";
        return temp; 
    }

    public override string ToString()
    {
        return Remainders( 0, values.Length - 1);
    }
}

public class Application
{
    public static void Main() 
    {
        Int32 s;
        Int32 p;
        BinarySearch bs = new BinarySearch(15);
        Console.WriteLine(bs);

        Console.Write("Enter an integer(-1 to quit): " );
        s = Convert.ToInt32(Console.ReadLine());
        Console.WriteLine();

        while (s != -1)
        {
            p = bs.Search(s);
            if ( p != -1 )
                Console.WriteLine("Integer {0} was found in place {1}.\n", s, p);
            else
                Console.WriteLine("Integer {0} was not found.\n", s);

            Console.Write("Enter an integer value(-1 to quit): ");
            s = Convert.ToInt32(Console.ReadLine());
        }
    }
}

This algorithm, as previously stated, starts its search with the middle element. This is the result from a binary search after searching through a short string of randomly generated numbers:

11 29 49 51 52 55 65 71 77 83 84 85 85 87 90

Enter an integer(-1 to quit): 55

11 29 49 51 52 55 65 71 77 83 84 85 85 87 90
 *
 *
 *
 *
 *
 *
 *
11 29 49 51 52 55 65
 *
 *
 *
    52 55 65
 *
 *
 *
 *
 *
Integer 55 was found in place 5.

Sorting algorithms

The Merge Sort algorithm is more efficient than the Insertion Sort and the Selection Sort algorithms. Ever heard of the "divide-and-conquer" strategy"? Merge Sort is based on a divide-and-conquer strategy. First the data is sorted into two halves. Next, each half is sorted independently and may not be sorted recursively. Then the two sorted halves are merged together to form the complete sorted sequence:

C#
using System;
public class Program
{
    public static void MergeSort(ref int[] x, int left, int right)
    {
        if (left < right)
        {
            int middle = (left + right) / 2;
            MergeSort(ref x, left, middle);
            MergeSort(ref x, middle + 1, right);
            Merge(ref x, left, middle, middle + 1, right);
        }
    }

    public static void Merge(ref int[] x, int left, int middle, int middle1, int right)
    {
        int oldPosition = left;
        int size = right - left + 1;
        int[] temp = new int[size];
        int i = 0;

        while (left <= middle && middle1 <= right)
        {
            if (x[left] <= x[middle1])
                temp[i++] = x[left++];
            else
                temp[i++] = x[middle1++];
        }
        if (left > middle)
            for (int j = middle1; j <= right; j++)
                temp[i++] = x[middle1++];
            else
                for (int j = left; j <= middle; j++)
                    temp[i++] = x[left++];
            Array.Copy(temp, 0, x, oldPosition, size);
    }  

    private static void ScrambleData(ref int[] x, Random rdn)
    {
        for (int i = 0; i <= x.Length - 1; i++)
        {
            x[i] = (int)(rdn.NextDouble() * x.Length);
        }
    }

    public static void PrintData(ref int[] xArray, char status, string sortname)
    {
        if (status == 'a')
            Console.WriteLine("After sorting using algorithm: " + sortname);
        else
            Console.WriteLine("Before sorting");
        for (int i = 0; i <= xArray.Length - 1; i++)
        {
            if ((i != 0) && (i % 10 == 0))
                Console.Write("\n");
            Console.Write(xArray[i] + " ");
        }
        Console.ReadLine();
    }

    public static void Main()
    {
        const int items = 50;
        Random rd = new Random(items);
        Int32[] values = new Int32[items];
        ScrambleData(ref values, rd);
        PrintData(ref values, 'b', "");
        Console.WriteLine();
        MergeSort(ref values, 0, values.Length - 1);
        PrintData(ref values, 'a', "MergeSort");
        Console.WriteLine("\n\n");
    }
}

Capture1.JPG

Insertion Sort is another simple, but "inefficient", sorting algorithm. Its first iteration takes the second element in the array and, if it's less than the first, swaps them. The second iteration looks at the third element and inserts it in the correct position with respect to the first two elements, so all three elements are in order. At the ith iteration of this algorithm, the first i elements in the original array will be sorted:

34    56    4    10    77    51    93    30    5    52

An application that implements the insertion sort algorithm first looks at the first two elements of the array, 34 and 56. These are already in order, so the application continues (if they were out of order, it would swap them). In the next iteration, the application looks at the third value, 4. This value is less than 56, so the application stores 4 in a temporary variable and moves 56 one element to the right. The application then checks and determines that 4 is less than 34, so it moves 34 one element to the right. The application has now reached the beginning of the array, so it places 4 in the zeroth position. The array now is:

4    34    56    10    77    51    93    30    5    52

My guess is that somehow the reader might just think that this algorithm would be really powerful if the data is already in order! But that rarely is the case (particularly when random numbers are generated for demonstration purposes). But like Selection Sort, there are many benefits to these algorithms; they also serve as a baseline for the more professional ones.

C#
using System;
public class ISort
{
    private Int32[] values;
    private static Random rd = new Random();

    public ISort(Int32 size)
    {
        values = new Int32[ size ];
        for (Int32 i = 0; i < size; ++i)
            values[ i ] = rd.Next(10, 100);
    }

    public void Sort()
    {
        Int32 input;
        for (Int32 n = 1; n < values.Length; n++)
        {
            input = values[ n ];
            Int32 m = n;
            while ( m > 0 && values[m - 1] > input )
            {
                values[ m ] = values[m - 1];
                m--;
            }
            values[ m ] = input;
            PrintData(n, m);
        }
    }

    public void PrintData(Int32 p, Int32 x)
    {
        Console.Write( "after pass {0}: ", p);
        for (Int32 i = 0; i < x; ++i)
            Console.Write( values[ i ] + " " );
        Console.Write( values[x] + "* " );
        for (Int32 i = x + 1; i < values.Length; ++i)
            Console.Write( values[ i ] + " " );
        Console.Write( "\n              ");

        for (Int32 i = 0; i <= p; ++i)
            Console.Write("--");
        Console.WriteLine("\n");
    }

    public override string ToString()
    {
        string temp = string.Empty;

        foreach (Int32 element in values)
            temp += element + " ";
        temp += "\n";
        return temp;
    }
}

public class Program {
    public static void Main() {
        ISort ins = new ISort(20);
        Console.WriteLine( "Unsorted Array:");
        Console.WriteLine( ins);
        ins.Sort();
        Console.WriteLine("Sorted array: ");
        Console.WriteLine(ins);
    }
}

Notice that when we instantiate the ISort class and create an object, we pass a value of 0 in to signify that the body members of this class are to generate 20 random numbers. Recall, now, that these integers are unsorted. The output might reveal why this is considered an inefficient algorithm:

Unsorted Array:
60 95 68 39 59 18 28 98 27 89 43 89 76 88 29 96 52 78 36 37

after pass 1: 60 95* 68 39 59 18 28 98 27 89 43 89 76 88 29 96 52 78 36 37
              ----

after pass 2: 60 68* 95 39 59 18 28 98 27 89 43 89 76 88 29 96 52 78 36 37
              ------

after pass 3: 39* 60 68 95 59 18 28 98 27 89 43 89 76 88 29 96 52 78 36 37
              --------

after pass 4: 39 59* 60 68 95 18 28 98 27 89 43 89 76 88 29 96 52 78 36 37
              ----------

after pass 5: 18* 39 59 60 68 95 28 98 27 89 43 89 76 88 29 96 52 78 36 37
              ------------

after pass 6: 18 28* 39 59 60 68 95 98 27 89 43 89 76 88 29 96 52 78 36 37
              --------------

after pass 7: 18 28 39 59 60 68 95 98* 27 89 43 89 76 88 29 96 52 78 36 37
              ----------------

after pass 8: 18 27* 28 39 59 60 68 95 98 89 43 89 76 88 29 96 52 78 36 37
              ------------------

after pass 9: 18 27 28 39 59 60 68 89* 95 98 43 89 76 88 29 96 52 78 36 37
              --------------------

after pass 10: 18 27 28 39 43* 59 60 68 89 95 98 89 76 88 29 96 52 78 36 37
              ----------------------

after pass 11: 18 27 28 39 43 59 60 68 89 89* 95 98 76 88 29 96 52 78 36 37
              ------------------------

after pass 12: 18 27 28 39 43 59 60 68 76* 89 89 95 98 88 29 96 52 78 36 37
              --------------------------

after pass 13: 18 27 28 39 43 59 60 68 76 88* 89 89 95 98 29 96 52 78 36 37
              ----------------------------

after pass 14: 18 27 28 29* 39 43 59 60 68 76 88 89 89 95 98 96 52 78 36 37
              ------------------------------

after pass 15: 18 27 28 29 39 43 59 60 68 76 88 89 89 95 96* 98 52 78 36 37
              --------------------------------

after pass 16: 18 27 28 29 39 43 52* 59 60 68 76 88 89 89 95 96 98 78 36 37
              ----------------------------------

after pass 17: 18 27 28 29 39 43 52 59 60 68 76 78* 88 89 89 95 96 98 36 37
              ------------------------------------

after pass 18: 18 27 28 29 36* 39 43 52 59 60 68 76 78 88 89 89 95 96 98 37
              --------------------------------------

after pass 19: 18 27 28 29 36 37* 39 43 52 59 60 68 76 78 88 89 89 95 96 98
              ----------------------------------------

Sorted array:
18 27 28 29 36 37 39 43 52 59 60 68 76 78 88 89 89 95 96 98

Could an algorithm be labeled as inefficient to be coded differently to improve its performance?

Here is an example of the Selection Sort algorithm. Here we are using syntax that passes parameters by reference to a method. Note that the CLR views the ref and out keywords as doing the same thing. They produce the same IL, apparently. Examine this brief code snippet:

C#
using System;
public sealed class Program {
    public static void Main() {
        Int32 x = 5; 
        InsertValue(ref x); 
        Console.WriteLine(x);     // output is "15"
    }
    private static void InsertValue(ref Int32 v) {
        v += 10; // This method can use the initialized value in v.
    }
}

If a method's parameter is marked with ref, the caller must initialize the parameter's value prior to calling the method. The called method can read from the value and/or write to the value:

C#
using System;
public class Program {

    public static void SelectionSort(ref int[] x)
    {
        int i, j;
        int min, temp;

        for (i = 0; i < x.Length - 1; i++)
        {
            min = i;
            for (j = i + 1; j < x.Length; j++)
            {
                if (x[j] < x[min])
                {
                    min = j;
                }
            }
            temp = x[i];
            x[i] = x[min];
            x[min] = temp;
        }
    }

    private static void UnSortData(ref int[] x, Random rdn)
    {
        for (int i = 0; i <= x.Length - 1; i++)
        {
            x[i] = (int)(rdn.NextDouble() * x.Length);
        }
    }

    public static void PrintData(ref int[] xArray, char status, string sortname)
    {
        if (status == 'a')
            Console.WriteLine("After sorting using algorithm: " + sortname);
        else
            Console.WriteLine("Before sorting");
        for (int i = 0; i <= xArray.Length - 1; i++)
        {
            if ((i != 0) && (i % 10 == 0))
                Console.Write("\n");
            Console.Write(xArray[i] + " ");
        }
        Console.ReadLine();
    }

    public static void Main()
    {
        const int items = 40;
        Random rnd = new Random(items);
        int[] data = new int[items];
        UnSortData(ref data, rnd);
        PrintData(ref data, 'b', "");
        SelectionSort(ref data);
        Console.WriteLine();
        PrintData(ref data, 'a', "SelectionSort");
    }
}

Capture3.JPG

C#
using System;
public class SelectionSort
{
    private int[] data; 
    private static Random generator = new Random();

    public SelectionSort( int size )
    {
        data = new int[ size ]; 

        for ( int i = 0; i < size; i++ )
            data[ i ] = generator.Next( 10, 100 );
    } 
                                 
    public void Sort()
    {
        int smallest;

        // loop over data.Length - 1 elements
        for ( int i = 0; i < data.Length - 1; i++ )
        {
            smallest = i;

            for ( int index = i + 1; index < data.Length; index++ )
                if ( data[ index ] < data[ smallest ] )
                    smallest = index;
            Swap( i, smallest ); 
            PrintData( i + 1, smallest ); 
        }                                            
    }                                             
                   
    public void Swap( int first, int second )
    {
        int temporary = data[ first ];   
        data[ first ] = data[ second ]; 
        data[ second ] = temporary;      
    }                                           

    public void PrintData( int pass, int index )
    {
        Console.Write( "after pass {0}: ", pass );

        // output elements through the selected item
        for ( int i = 0; i < index; i++ )
            Console.Write( data[ i ] + "  " );

        Console.Write( data[ index ] + "* " ); 

        // finish outputting array
        for ( int i = index + 1; i < data.Length; i++ )
            Console.Write( data[ i ] + "  " );

        Console.Write( "\n              " ); 

        // indicate amount of array that is sorted
        for ( int j = 0; j < pass; j++ )
            Console.Write( "--  " );
        Console.WriteLine( "\n" ); 
    } 

    public override string ToString()
    {
        string temporary = string.Empty;

        // iterate through array
        foreach ( int element in data )
            temporary += element + "  ";

        temporary += "\n"; 
        return temporary;
    } 
} 

public class Program
{
    public static void Main()
    {
        // create object to perform selection sort
        SelectionSort sortArray = new SelectionSort( 15 );

        Console.WriteLine( "Unsorted array:" );
        Console.WriteLine( sortArray );  

        sortArray.Sort(); 

        Console.WriteLine( "Sorted array:" );
        Console.WriteLine( sortArray ); 
    } 
}

The results show the actual process described above taking place:

Unsorted array:
48  43  61  55  30  88  30  77  10  61  42  36  23  25  30

after pass 1: 10  43  61  55  30  88  30  77  48* 61  42  36  23  25  30
              --

after pass 2: 10  23  61  55  30  88  30  77  48  61  42  36  43* 25  30
              --  --

after pass 3: 10  23  25  55  30  88  30  77  48  61  42  36  43  61* 30
              --  --  --

after pass 4: 10  23  25  30  55* 88  30  77  48  61  42  36  43  61  30
              --  --  --  --

after pass 5: 10  23  25  30  30  88  55* 77  48  61  42  36  43  61  30
              --  --  --  --  --

after pass 6: 10  23  25  30  30  30  55  77  48  61  42  36  43  61  88*
              --  --  --  --  --  --
    and  so on . . . . . 
              --  --  --  --  --  --  --  --  --  --

after pass 11: 10  23  25  30  30  30  36  42  43  48  55  77* 61  61  88
              --  --  --  --  --  --  --  --  --  --  --

after pass 12: 10  23  25  30  30  30  36  42  43  48  55  61  77* 61  88
              --  --  --  --  --  --  --  --  --  --  --  --

after pass 13: 10  23  25  30  30  30  36  42  43  48  55  61  61  77* 88
              --  --  --  --  --  --  --  --  --  --  --  --  --

after pass 14: 10  23  25  30  30  30  36  42  43  48  55  61  61  77* 88
              --  --  --  --  --  --  --  --  --  --  --  --  --  --

Sorted array:
10  23  25  30  30  30  36  42  43  48  55  61  61  77  88

The QuickSort algorithm: Said to be the fastest sorting algorithm

The topic of algorithms always involves an analysis of performance, mainly speed. QuickSort is probably the most popular sorting algorithm. The written code has been generating random numbers that result in an unordered sequence. We take this data, scramble it, feed it in to the sorting algorithm, and then display the results. We have not been using a System.Diagnostics timer to measure the performance, however. Suffice to believe that documentation insists that QuickSort is the fastest algorithm for sorting. QuickSort uses a sort of "divide and conquer" method for sorting that begins by first partitioning the input list into two parts. Each partition is then sorted separately by recursively calling itself over and over again until the entire list is sorted. A recursive function, as we well know, is a function that calls itself, similar to the factorial process in math. Examine the code, notice that we pass parameters to the method by reference (using the "ref" keyword), and how the code partitions the input list. To the eye, this would appear more efficient than sifting through the input list one at a time and then placing according to numerical value:

C#
using System;

public static class QuickSortAlg
{
    public static void QuickSort(ref int[] x)
    {
        qs(x, 0, x.Length - 1);
    }

    public static void qs(int[] x, int left, int right)
    {
        int i, j;
        int pivot, temp;

        i = left;
        j = right;
        pivot = x[(left + right) / 2];

        do
        {
            while ((x[i] < pivot) && (i < right)) i++;
            while ((pivot < x[j]) && (j > left)) j--;

            if (i <= j)
            {
                temp = x[i];
                x[i] = x[j];
                x[j] = temp;
                i++; j--;
            }
        } while (i <= j);

        if (left < j) qs(x, left, j);
        if (i < right) qs(x, i, right);
    }

    public  static void ScrambleData(ref int[] x, Random rdn)
    {
        for (int i = 0; i <= x.Length - 1; i++)
        {
            x[i] = (int)(rdn.NextDouble() * x.Length);
        }
    }

    public static void PrintData(ref int[] xArray, char status, string sortname)
    {
        if (status == 'a')
            Console.WriteLine("After sorting using algorithm: " + sortname);
        else
            Console.WriteLine("Before sorting");
        for (int i = 0; i <= xArray.Length - 1; i++)
        {
            if ((i != 0) && (i % 10 == 0))
                Console.Write("\n");
            Console.Write(xArray[i] + " ");
        }
        Console.ReadLine();
    }

    public static void Main(string[] args)
    {
        const int items = 30;
        Random rdn = new Random(items);
        int[] info = new int[items];
        ScrambleData(ref info, rdn);
        PrintData(ref info, 'b', "");
        Console.WriteLine();
        QuickSort(ref info);
        PrintData(ref info, 'a', "QuickSort");
        Console.WriteLine("\n\n");
    }
}

The output appears the same as the other outputs. Technical documentation insists that "in practice" QuickSort is the fastest. This could imply that it works better with larger amounts of data that is unsorted. This is what the output should look like:

Before sorting
11 18 22 28 22 23 10 14 20 10
11 26 7 2 29 22 14 17 14 27
24 6 1 15 19 24 21 8 24 7

After sorting using algorithm: QuickSort
1 2 6 7 7 8 10 10 11 11
14 14 14 15 17 18 19 20 21 22
22 22 23 24 24 24 26 27 28 29

Algorithms are widely used to transform data as well. The output data is typically the input into a series of steps that perform arithmetic and/or logical operations on that data. In this case, however, an input array is traversed, temporarily storing values that it will need to retrieve to place in their proper location, to output an ordered sequence of data. This could apply to any storehouse of data: employee ID numbers, names, locations, etc...

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)