Introduction
Count Sort is a Linear Sorting algorithm which sorts elements in O(n) time. The other common linear sorts include Bucket and Radix sorts.
What is Linear Sorting Algorithm
Linear sorting algorithm is a sorting algorithm that does not use any comparison operator (>, <, >= , <= , ==) to determine the sorting order of the elements. The sorting is achieved by acute logic build, and the overall time taken by the algorithm is hence linear.
Actually, if we contrast linear sort to other comparison sorts with respect to time, we will find that comparison sorts can do n log n at their best and exponential at worse in terms of time. The linear sort gives linear performance and thus has fine edge in time over these algorithms.
Note: Along with time, we also have to look at space/memory usage of a particular algorithm, and actually this is exactly where Linear Sort falls back as Linear Sort uses extra space.
The Internal Working of Count Sort
The Count Sort is quite simple as compared to other sorting algorithms (not even a single nested loop). I will explain the logical working of count sort, and for plainness purpose, use array data structure.
There are only two classes in the code, namely:
CountSort
: This class contains methods Max
, Count_Sort()
and Display()
, for finding maximum elements in array, sorting the array with Count Sort, and for displaying all the elements in the array, respectively.
CountSortApp
: This class contains Main()
method that contains CountSort
class object and calls to methods of CountSort
, and does nothing more.
The CountSort
class has only two attributes:
class CountSort
{
int []thearray;
int i;
…..
Now, let's go through the constructor of CountSort
.
public CountSort(int size)
{
thearray = new int [size] ;
Random ran = new Random();
for(int i = 0 ; i< size; i++)
{
thearray[i] = ran.Next(i,i*size);
}
}
- The constructor takes the size of array and makes
thearray
of this size.
- Initialize
thearray
to random numbers. First, an object of class Random
is declared, i.e., ran
. Its method Next(lowerbound, upperbound)
is used to generate random number at each iteration and equate it to thearray [ i ]
. So the thearray
is filled up with random numbers.
Next is the Max()
method. Actually, the sole purpose of this method is to find the maximum number in the thearray
.
private int Max()
{
int max = thearray[0];
for(i = 1 ; i<thearray.Length ;i++)
{
if (thearray[i] > max )
{
max = thearray[i] ;
}
}
return max;
}
Let's go through this method’s code step by step.
First, let's take the first element in the array to be the maximum number in the whole array, i.e., max = thearray[0]
. Now iterate over the whole array except the first element (as we already select it as the maximum element). Whenever you find an element greater than the value contained in variable max
, just replace the value of max
with this value. Actually, the check if (thearray[i] > max)
does the checking of element at index i
with the current value of max
, and in the case it is greater, replaces it with max = thearray[i]
.
Now we come to the Count_Sort
method, which contains all the real work. First, there is the declaration for variables used by the CountSort
method.
int k = Max();
int []output = new int[thearray.Length];
int []temp = new int[k+1];
- Declaration of variable
k
which holds the maximum element in thearray
obtained by calling Max()
method.
- Declaration of array output of type
int
, equaling in size of the original array. The output array will have final sorted values, as you can see that in order to have original array sorted, you have to copy from output array to original array, i.e., thearray
. I have more to say about this and other crucial facts in the last section.
- Finally, the declaration of array
temp
, which will provide the indexing up to maximum value in thearray
. temp
array is of size k
+1, since in C#, in order to have indexing up to ‘k
’ we need to declare it of size k
+1.
Now, we have a bunch of for
loops. I will go through each of them separately, in order to clarify their functionality.
The first for
loop:
for( i= 0 ; i<k+1 ; i++)
{
temp[ i ] = 0;
}
The is initialization of the array temp
to 0. Actually, this step is done in order to keep clarity and pace with the algorithm. What is done is only to make sure that every element of temp
is set to zero, as this zero will play an important role later.
The second for
loop:
for(i = 0 ; i<thearray.Length ; i++)
{
temp [ thearray [ i ] ] = temp [ thearray [ i ] ] + 1;
}
This loop contains some elaborate logic, worth explaining. Let's consider the original array thearray
that contained values {2,1,3,2,4}. So the value of k
will be 4, and we have array temp
of size k
+1 = 5, and an output array of size same as thearray
i.e., 5. It can be shown diagrammatically as below:
The statement temp [ thearray [ i ] ] = temp [ thearray [ i ] ] + 1
can be broken into:
int p = thearray [ i ] ;
temp [ p ] = temp [ p ] + 1 ;
The element at thearray [ i ]
is saved in variable p
and this value is used for indexing the array temp
, and finally adding one to the value that is obtained by indexing.
In temp
array, every index corresponds to the element / possible element of original unsorted array thearray
. For e.g., the index number 2 in temp
corresponds to the element 2 in thearray
. See also that index 0 does not have corresponding element 0 in thearray
. So the statement temp [ p ] +1
will actually mean that we have an occurrence element p
, so increment it by one so its number of occurrences can be captured. Well, I also feel pretty much hazy, so let's take a look at what happened for different values of i
and get the picture right.
For i
= 0:
p = thearray [ 0 ] ;
temp [ 2 ] = temp [ 2 ] + 1 ;
temp [ 2 ] = 1
Now, for i
= 1:
p = thearray [ 1 ] ;
temp [ 3 ] = temp [ 3 ] + 1 ;
Now, for i
= 3:
p = thearray [ 3 ] ;
temp [ 2 ] = temp [ 2 ] + 1 ;
And so on for the other values of i
. After which we have temp
containing occurrences of each and every element of thearray
. This is illustrated in the figure below.
Now, let's move to the third for
loop:
for( i = 1 ; i<k+1 ; i++)
{
temp[ i ] = temp[ i ] + temp[ i - 1];
}
The loop simply runs up to size of the array temp
. The main reason of initializing i
= 1 will be clear as I explain what we are trying to do in this loop. The main idea is to start with 2nd element (at index 1) of the array temp
, and add all previous elements including itself (in the case of the 2nd element, these are elements at indexes 0 and 1 ), then move to next element 3rd (at index 2) and again add all previous elements including itself (in the case of the 3rd element, these are elements at indexes 0, 1 and 2), and so on till the end of array temp
. Since there is no element prior to 1st element (at index 0), i
is set to 1 in the start.
The temp
array after execution of this loop is shown in the following figure:
and now the forth and final for
loop:
for(i = thearray.Length-1 ; i>=0 ; i--)
{
output [ temp [ thearray [ i ] ] -1 ] = thearray [ i ] ;
temp [ thearray [ i ] ] = temp [ thearray [ i ] ] - 1;
}
Now break the statement in the loop as:
int p = thearray [ i ];
temp [ p ] = temp [ p ] - 1;
We initialize variable p
with the elements of thearray
one by one. Next, we use p
as an index for the array temp
, we get the value at that index, decrement it by 1, and then put this decremented value back in array
temp at index p
. (See figure below.)
Well, here is the explanation: as we are building the final sorted array in this very last loop, these statements actually occur due to the fact that the values in array temp
are used by the sorted array output as an index, and the value that is just accessed by output array must be decremented by one or else we will not able to put the same elements found in the original array at consecutive locations in the final sorted array output.
This fact is demonstrated in the following figures:
For i
=4:
For i
=1:
When temp
is not decremented by 1, note that the element 3 is inserted at index 4 again (below):
The final sorted array output is shown in the figure below. Also shown is the state of array temp
after final step.
Conclusion
In this article, I have shown how to implement the Count Sort, a sorting algorithm that sorts elements in Linear Time. Well, there are other factors that one must consider before using Count Sort. The Count Sort is stable, means numbers with the values appear in the output array in the same order as they do in input array. The fact that there is no comparison used in by this algorithm makes it difficult if not impossible to use it with custom objects rather than primitive data types; and even in the case of primitive types, the algorithm can perform much worse in terms of memory usage if only single element in the input set is close to maximum range for that type.
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.