Introduction
In this article I'm going to show you my personal implementation of MergeSort algorithm, realized with C#. The scope of this implementation
is not to find the faster way to implement the famous algorithm (you can easily find dozens of implementations with a quick search). The scope is to create and share an implementation based on classes and objects, trying to organize and decouple the code that usually is completely written in a big unreadable method. Voluntarily, I didn't use any function or object from .NET libraries, because I wanted to maintain the solution "platform independent"
as much as possible.
In my approach I'll try to divide responsibilities, putting it in different classes and writing (hopefully) a "clean code". Still, I'll support my implementation with a small test suite I created to develop the algorithm with a TDD approach.
The scope of this article is not to teach you about something, but simply to share with you what I made, to collect as many suggestions and improvements as possible, so feel free to make your observations.
The original MergeSort algorithm
Below you can find the description of MergeSort algorithm, taken from Wikipedia. Conceptually, the algorithm starts from an unsorted list and works in the following way:
- Divide the unsorted list into n sublists, each containing one element (a list of one element is considered sorted).
- Repeatedly merge sublists to produce new sublists until there is only one sublist remaining. (This will be the sorted list.)
The pseudo code usually needed to implement the algorithm contains these two functions:
- merge_sort function
- Merge function
And it appears like this (from Wikipedia):
function merge_sort(list m)
if length(m) <= 1
return m
var list left, right
var integer middle = length(m) / 2
for each x in m before middle
add x to left
for each x in m after or equal middle
add x to right
left = merge_sort(left)
right = merge_sort(right)
return merge(left, right)
function merge(left, right)
var list result
while length(left) > 0 or length(right) > 0
if length(left) > 0 and length(right) > 0
if first(left) <= first(right)
append first(left) to result
left = rest(left)
else
append first(right) to result
right = rest(right)
else if length(left) > 0
append first(left) to result
left = rest(left)
else if length(right) > 0
append first(right) to result
right = rest(right)
end while
return result
Mmmh....this pseudo-code version seems to do a lot of things in the same functions: check
length of arrays, compare numbers, manage iterations...actually too much things for only two functions. This is a very good example on how to try decompose responsibilities. Let's go.
How my code works
Well, looking into this MergeSort I found three big responsibilities, that describe the main steps of the algorithm. Running MergeSort, we always need to:
- Split original array in parts smaller than two elements;
- Sort the split arrays with a compare between elements;
- Merge the sorted arrays;
For this behaviors, I created three objects, named
ArraySplitter
ComparableArray
Merger
Everything is driven from a class named "MergeSort
" that uses this three objects in a very simple way, doing only three actions:
- Split arrays
- Sort split arrays
- Merge arrays
And so, if you look at the code of this class, you can see this:
public int[] Sort(int[] arrayToSort)
{
if (arrayToSort.Length==1)
return arrayToSort;
if (arrayToSort.Length == 2)
return SortTwoElements(arrayToSort[0], arrayToSort[1]);
var splitter = new ArraySplitter(arrayToSort);
int[] splittedArrayA = splitter.GetFirstPart();
int[] splittedArrayB = splitter.GetSecondPart();
int[] sortedArrayA = Sort(splittedArrayA);
int[] sortedArrayB = Sort(splittedArrayB);
var merger = new Merger();
return merger.Merge(sortedArrayA,sortedArrayB);
}
private int[] SortTwoElements(int firstElement, int secondElement)
{
var min = (Math.Abs(firstElement + secondElement) - Math.Abs(firstElement - secondElement)) / 2;
var max = (Math.Abs(firstElement + secondElement) + Math.Abs(firstElement - secondElement)) / 2;
return new[] { min, max };
}
And finally, now, we are able to understand who is doing what, because at this level we don't see all the "technical stuff" or better we
don't want to see how things are done, because this is something that we want to delegate to internal classes.
At this level we want only to see that the algorithm executes three main steps:
- Split
- Sort
- Merge
A closer look to the code
ArraySplitter
This is the class that allows you to split the array in two parts. Special case is required when you have an array with odd dimension.
public class ArraySplitter
{
private readonly int[] _arrayToSort;
private readonly int[] _firstPart;
private readonly int[] _secondPart;
public ArraySplitter(int[] arrayToSort)
{
_arrayToSort = arrayToSort;
_firstPart = new int[_arrayToSort.Length / 2];
_secondPart = new int[_arrayToSort.Length - _firstPart.Length];
}
public int[] GetFirstPart()
{
for (int i = 0; i < (_arrayToSort.Length / 2); i++)
_firstPart[i] = _arrayToSort[i];
return _firstPart;
}
public int[] GetSecondPart()
{
for (int i = 0; i < _arrayToSort.Length - (_arrayToSort.Length / 2); i++)
_secondPart[i] = _arrayToSort[_firstPart.Length + i];
return _secondPart;
}
}
Merger
This class is able to merge two sorted array and to return a new array that keeps elements sorted. This array is built by taking out always the smallest element between the two arrays and the only responsibility related to this class is
to merge the elements. The responsibility to decide which is the smallest element is delegated to the object "ComparableArray
" (see below).
public class Merger
{
public int[] Merge(int[] sortedArrayA, int[] sortedArrayB)
{
var mergedArray = new int[sortedArrayA.Length + sortedArrayB.Length];
var visitableArrayA = new ComparableArray(sortedArrayB);
var visitableArrayB = new ComparableArray(sortedArrayA);
for (int i = 0; i < mergedArray.Length; i++)
mergedArray[i] = visitableArrayB.GetNextSmallerElement(visitableArrayA);
return mergedArray;
}
}
ComparableArray
This is the "special" array that contains, inside, the logic to retrieve the next smaller element comparing itself to another arrays like its. Probably the name is not the best choice....
Anyway, this array maintains a sort of "window" always opened on the next element to compare. All the logic needed to move through the array elements and compare a new element if the previous was already picked up, is encapsulated inside the array itself. The code needs to consider the case when the array is bigger (or smaller) than the compared one. In the snippet below I'll show you the small piece of code that take care of this but you can look at the full implementation going into the code.
public class ComparableArray
{
...
public int GetNextSmallerElement(ComparableArray arrayToCompare)
{
if (CanPickNext)
{
if (arrayToCompare.CanPickNext)
{
if (Current < arrayToCompare.Current)
return GetAndMove();
else
return arrayToCompare.GetAndMove();
}
return GetAndMove();
}
return arrayToCompare.GetAndMove();
}
...
}
History
- 2012-07-17: First version of the article.
- 2012-07-18: Enhanced description of how the code works.
- 2012-08-16: Included a new implementation of the method "
SortTwoElements
" that now can sort elements without use an IF comparison. This implementation has been provided from G.Giordano, that I would like to thank for his suggestion. - 2012-11-14: The implementation of the method "
SortTwoElements
" is now shown in the code snippet in the section "How my code works".