Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / programming / algorithm

Merge Sort algorithm developed with an Object Oriented approach

4.56/5 (8 votes)
14 Nov 2012CPOL4 min read 45.3K   418  
A clean version of merge sort algorithm, implemented with classes and objects instead of the usual, unreadable big function.

Introduction

In this article I'm going to show you my personal implementation of MergeSort algorithm, realized with C#. The scope of this implementation is not to find the faster way to implement the famous algorithm (you can easily find dozens of implementations with a quick search). The scope is to create and share an implementation based on classes and objects, trying to organize and decouple the code that usually is completely written in a big unreadable method. Voluntarily, I didn't use any function or object from .NET libraries, because I wanted to maintain the solution "platform independent" as much as possible.

In my approach I'll try to divide responsibilities, putting it in different classes and writing (hopefully) a "clean code". Still, I'll support my implementation with a small test suite I created to develop the algorithm with a TDD approach.

The scope of this article is not to teach you about something, but simply to share with you what I made, to collect as many suggestions and improvements as possible, so feel free to make your observations.

The original MergeSort algorithm

Below you can find the description of MergeSort algorithm, taken from Wikipedia. Conceptually, the algorithm starts from an unsorted list and works in the following way:

  1. Divide the unsorted list into n sublists, each containing one element (a list of one element is considered sorted).
  2. Repeatedly merge sublists to produce new sublists until there is only one sublist remaining. (This will be the sorted list.)

The pseudo code usually needed to implement the algorithm contains these two functions:

  • merge_sort function
  • Merge function

And it appears like this (from Wikipedia):

function merge_sort(list m)
    // if list size is 1, consider it sorted and return it
    if length(m) <= 1
        return m
    // else list size is > 1, so split the list into two sublists
    var list left, right
    var integer middle = length(m) / 2
    for each x in m before middle
         add x to left
    for each x in m after or equal middle
         add x to right
    // recursively call merge_sort() to further split each sublist
    // until sublist size is 1
    left = merge_sort(left)
    right = merge_sort(right)
    // merge the sublists returned from prior calls to merge_sort()
    // and return the resulting merged sublist
    return merge(left, right)

function merge(left, right)
    var list result
    while length(left) > 0 or length(right) > 0
        if length(left) > 0 and length(right) > 0
            if first(left) <= first(right)
                append first(left) to result
                left = rest(left)
            else
                append first(right) to result
                right = rest(right)
        else if length(left) > 0
            append first(left) to result
            left = rest(left)
        else if length(right) > 0
            append first(right) to result
            right = rest(right)
    end while
    return result

Mmmh....this pseudo-code version seems to do a lot of things in the same functions: check length of arrays, compare numbers, manage iterations...actually too much things for only two functions. This is a very good example on how to try decompose responsibilities. Let's go.

How my code works

Well, looking into this MergeSort I found three big responsibilities, that describe the main steps of the algorithm. Running MergeSort, we always need to:

  • Split original array in parts smaller than two elements;
  • Sort the split arrays with a compare between elements;
  • Merge the sorted arrays;

For this behaviors, I created three objects, named

  • ArraySplitter
  • ComparableArray
  • Merger

Everything is driven from a class named "MergeSort" that uses this three objects in a very simple way, doing only three actions:

  • Split arrays
  • Sort split arrays
  • Merge arrays

And so, if you look at the code of this class, you can see this:

C#
public int[] Sort(int[] arrayToSort)
{
    if (arrayToSort.Length==1)
        return arrayToSort;

    if (arrayToSort.Length == 2)
    return SortTwoElements(arrayToSort[0], arrayToSort[1]);

    var splitter = new ArraySplitter(arrayToSort);
    int[] splittedArrayA = splitter.GetFirstPart();
    int[] splittedArrayB = splitter.GetSecondPart();
            
    int[] sortedArrayA = Sort(splittedArrayA);
    int[] sortedArrayB = Sort(splittedArrayB);

    var merger = new Merger();
    return merger.Merge(sortedArrayA,sortedArrayB);
}

private int[] SortTwoElements(int firstElement, int secondElement)
{
    var min = (Math.Abs(firstElement + secondElement) - Math.Abs(firstElement - secondElement)) / 2;
    var max = (Math.Abs(firstElement + secondElement) + Math.Abs(firstElement - secondElement)) / 2;
    return new[] { min, max };
}

And finally, now, we are able to understand who is doing what, because at this level we don't see all the "technical stuff" or better we don't want to see how things are done, because this is something that we want to delegate to internal classes.

At this level we want only to see that the algorithm executes three main steps:

  1. Split
  2. Sort
  3. Merge

A closer look to the code

ArraySplitter

This is the class that allows you to split the array in two parts. Special case is required when you have an array with odd dimension.
C#
public class ArraySplitter
{
    private readonly int[] _arrayToSort;
    private readonly int[] _firstPart;
    private readonly int[] _secondPart;

    public ArraySplitter(int[] arrayToSort)
    {
        _arrayToSort = arrayToSort;
        _firstPart = new int[_arrayToSort.Length / 2];
        _secondPart = new int[_arrayToSort.Length - _firstPart.Length];
    }

    public int[] GetFirstPart()
    {
        for (int i = 0; i < (_arrayToSort.Length / 2); i++)
            _firstPart[i] = _arrayToSort[i];

        return _firstPart;
    }

    public int[] GetSecondPart()
    {
        for (int i = 0; i < _arrayToSort.Length - (_arrayToSort.Length / 2); i++)
            _secondPart[i] = _arrayToSort[_firstPart.Length + i];
        
        return _secondPart;
    }
}

Merger

This class is able to merge two sorted array and to return a new array that keeps elements sorted. This array is built by taking out always the smallest element between the two arrays and the only responsibility related to this class is to merge the elements. The responsibility to decide which is the smallest element is delegated to the object "ComparableArray" (see below).

C#
public class Merger
{
    public int[] Merge(int[] sortedArrayA, int[] sortedArrayB)
    {
        var mergedArray = new int[sortedArrayA.Length + sortedArrayB.Length];
        var visitableArrayA = new ComparableArray(sortedArrayB);
        var visitableArrayB = new ComparableArray(sortedArrayA);

        for (int i = 0; i < mergedArray.Length; i++)
            mergedArray[i] = visitableArrayB.GetNextSmallerElement(visitableArrayA);
        
        return mergedArray;
    }
}

ComparableArray

This is the "special" array that contains, inside, the logic to retrieve the next smaller element comparing itself to another arrays like its. Probably the name is not the best choice....

Anyway, this array maintains a sort of "window" always opened on the next element to compare. All the logic needed to move through the array elements and compare a new element if the previous was already picked up, is encapsulated inside the array itself. The code needs to consider the case when the array is bigger (or smaller) than the compared one. In the snippet below I'll show you the small piece of code that take care of this but you can look at the full implementation going into the code.

C#
public class ComparableArray
{
    ...

    public int GetNextSmallerElement(ComparableArray arrayToCompare)
    {
        if (CanPickNext)
        {
            if (arrayToCompare.CanPickNext)
            {
                if (Current < arrayToCompare.Current)
                    return GetAndMove();
                else
                    return arrayToCompare.GetAndMove();
            }
            return GetAndMove();
        }

        return arrayToCompare.GetAndMove();
    }

    ...
}

History

  • 2012-07-17: First version of the article.
  • 2012-07-18: Enhanced description of how the code works.
  • 2012-08-16: Included a new implementation of the method "SortTwoElements" that now can sort elements without use an IF comparison. This implementation has been provided from G.Giordano, that I would like to thank for his suggestion.
  • 2012-11-14: The implementation of the method "SortTwoElements" is now shown in the code snippet in the section "How my code works".

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)