Introduction
QuickSort was intended to be a very fast sort algorithm, with minimum execution time, a small code footprint, and the ability to sort a large quantity of data. I have been prompted to revisit QuickSort, a well thrashed topic, because several submissions and implementations have compromised speed for flexibility, and simplicity for one-size-fits-all complexity. Herein are suggestions for addressing those shortcomings, with code examples and a variety of implementations.
In this article, I will briefly cover several points of common interest, with some code examples. These points of interest are:
- My original, very brief QuickSort algorithm implemented in VB.NET. I do not lay claim to inventing the algorithm. The implementation is very quick with no clutter.
- An implementation of the algorithm that sorts 'By-Reference' rather than sorting 'In-Place'. The implementation is markedly faster than sorting data structures in place.
- Another implementation that is 'Stable' in that it maintains the original order of same-value items.
- A last, general implementation, that provides a class and method that may cater for many of your general QuickSort needs.
In the Beginning
The QuickSort algorithm was developed by C. A. R. Hoare in 1960, while working for the small British scientific computer manufacturer Elliott Brothers (Wikipedia, 2008).
The objective of QuickSort is to sort a large quantity of data in the shortest possible time. The algorithm employed is a recursive algorithm that separates the data into two partitions, 'high' (High) and 'low' (Low), that are sorted to the extent that no High data items are in the Low partition and no Low data items are in the High partition. For this purpose, the algorithm chooses an arbitrary 'pivot' value (Pivot) with which to partition the data. The algorithm does not have to be implemented as recursive, but it recursively executes the same logic for the Low and High partitions separately, thereby eventually sorting the entire data set.
A major disadvantage of the algorithm, in its simplest form, is that it is not a stable sort. In the process of swapping, the original order of items with the same 'value' may be put out of order. This means that raw QuickSort’s value as a transaction sorting engine is limited. However, the stable sort problem can easily be overcome at the expense of machine cycles.
Raw QuickSort In-Place
The following snippet is my original integer QuickSort, in-place algorithm. I do not claim to have invented this algorithm. The snippet is provided without comments to illustrate just how brief it is, and for a mental exercise to allow the reader to fathom how and why it can possibly work. Documented code is contained in the download. The result is a sorted awIntArr as Int32()
array.
Public Shared Sub QuickSortInt(ByRef awIntArr As Int32(), _
ByVal aLoIdx As Int32, ByVal aHiIdx As Int32)
Dim xTmp As Int32
Dim xLo As Int32 = aLoIdx
Dim xHi As Int32 = aHiIdx
Dim xPivot As Int32 = awIntArr((xLo + xHi) \ 2)
Do Until xLo > xHi
Do While awIntArr(xLo) < xPivot
xLo += 1
Loop
Do While awIntArr(xHi) > xPivot
xHi -= 1
Loop
If xLo <= xHi Then
xTmp = awIntArr(xLo)
awIntArr(xLo) = awIntArr(xHi)
awIntArr(xHi) = xTmp
xLo += 1
xHi -= 1
End If
Loop
If (xLo < aHiIdx) Then QuickSortInt(awIntArr, xLo, aHiIdx)
If (xHi > aLoIdx) Then QuickSortInt(awIntArr, aLoIdx, xHi)
End Sub
Notes
I acknowledge that the Pivot value should be chosen at random to avoid worst case sorting of O(n2) instead of the expected O(nLog2n). However, I perversely continue to choose the central index to avoid this already-sorted issue.
This code executes at under two seconds (on my drudge machine) in sorting one million integers ten times (see the samples). The 'bloated' common versions, one of which is included in the samples, take approximately 50% longer.
QuickSort By-Reference
QuickSort By-Reference does not move any of the original unsorted data, it simply sorts an array of indices that reference that data. The following snippet accepts an array of BigRecord
s, sorts on BigRecord.CustNo
, and avoids moving any records around. Instead, it moves 32-bit integer references about, a breeze for a 32-bit machine. The result is a sorted awRefArr as Int32()
array of references to the awBigRecArr as BigRecord()
array of unsorted, original data. One bonus side effect is that when you have sorted your index array, you have also sorted your data in reverse order – you just read the resultant index array in reverse.
Public Shared Sub QuickSortBigRecord _
(ByRef awBigRecArr As BigRecord(), ByRef awRefArr As Int32(), _
ByVal aStartIndex As Int32, ByVal aEndIndex As Int32)
Dim xTmp As Int32
Dim xLo As Int32 = aStartIndex
Dim xHi As Int32 = aEndIndex
Dim xPivot As Int32 = awRefArr((xLo + xHi) \ 2)
Do Until (xLo > xHi)
Do While awBigRecArr(awRefArr(xLo)).CustNo < awBigRecArr(xPivot).CustNo
xLo += 1
Loop
Do While awBigRecArr(awRefArr(xHi)).CustNo > awBigRecArr(xPivot).CustNo
xHi -= 1
Loop
If xLo <= xHi Then
xTmp = awRefArr(xLo)
awRefArr(xLo) = awRefArr(xHi)
awRefArr(xHi) = xTmp
xLo += 1
xHi -= 1
End If
Loop
If xLo < aEndIndex Then
QuickSortBigRecord(awBigRecArr, awRefArr, xLo, aEndIndex)
If xHi > aStartIndex Then
QuickSortBigRecord(awBigRecArr, awRefArr, aStartIndex, xHi)
End Sub
Notes
I generally use a reference QuickSort.
Stable QuickSort
When sorting data records, it is often useful to have the sequence of the data records preserved when items return the same value during the compare. Transactions that sell the same product to the same customer repeatedly over time may need to be kept 'Stable', in transaction date order, for some applications. QuickSort does not do this in its most efficient form ("Raw QuickSort"). The following changes to the "QuickSort By-Reference" snippet (above) achieve this with very little coding, without adding too many machine cycles to the processing.
Do While (awBigRecArr(awRefArr(xLo)) < awBigRecArr(xPivot)) _
OrElse ((awBigRecArr(awRefArr(xLo)) = awBigRecArr(xPivot)) _
AndAlso (awRefArr(xLo) < xPivot))
xLo += 1
Loop
Do While (awBigRecArr(awRefArr(xHi)) > awBigRecArr(xPivot)) _
OrElse ((awBigRecArr(awRefArr(xHi)) = awBigRecArr(xPivot)) _
AndAlso (awRefArr(xHi) > xPivot))
xHi -= 1
Loop
Notes
I generally have little need for a stable QuickSort; however, it consumes only a small percentage of additional machine cycles.
General QuickSort
I have seen at The Code Project and elsewhere, numerous exercises in generalising QuickSort to meet every situation. These efforts have, in many cases, led to bloated, cluttered, and less efficient code. My attempt here simply uses a base class that must be inherited, that implements IComparable
. It is reasonably trivial to add your particular data type to a descendant class, implement IComparable.CompareTo
, then call the generalised QuickSort.
The base class:
Public MustInherit Class clsBaseIComparable
Implements IComparable(Of clsBaseIComparable)
Public MustOverride Function CompareTo(ByVal other As clsBaseIComparable) _
As Integer Implements System.IComparable(Of clsBaseIComparable).CompareTo
End Class
A descendant class:
Public Class clsEmployee
Inherits clsBaseIComparable
Friend fFirstName As String
Friend fLastName As String
Friend fRole As String
Friend fSalary as Int32
Public Sub New _
(ByVal aFirstName As String _
, ByVal aLastName As String _
, ByVal aRole As String _
, ByVal aSalary as Int32)
Me.fFirstName = aFirstName
Me.fLastName = aLastName
Me.fRole = aRole
Me.fSalary = aSalary
End Sub
Public Overrides Function CompareTo(ByVal other As clsBaseIComparable) As Integer
If Me.fRole < DirectCast(other, clsEmployee).fRole Then
Return -1
ElseIf Me.fRole > DirectCast(other, clsEmployee).fRole Then
Return +1
ElseIf Me.fSalary < DirectCast(other, clsEmployee).fSalary Then
Return -1
ElseIf Me.fSalary > DirectCast(other, clsEmployee).fSalary Then
Return +1
ElseIf Me.fLastName < DirectCast(other, clsEmployee).fLastName Then
Return -1
ElseIf Me.fLastName > DirectCast(other, clsEmployee).fLastName Then
Return +1
ElseIf Me.fFirstName < DirectCast(other, clsEmployee).fFirstName Then
Return -1
ElseIf Me.fFirstName > DirectCast(other, clsEmployee).fFirstName Then
Return +1
Else
Return 0
End If
End Function
End Class
Not too grueling so far. Now, for the QuickSort. All that has been added to the "Stable QuickSort" implementation is an xComp
variable to avoid multiple calls to CompareTo
, and the use of CompareTo
in place of greater than and less than:
Public Shared Sub QuickSortEmployee _
(ByRef awEmpArr As clsEmployee(), ByRef awRefArr As Int32(), _
ByVal aStartIndex As Int32, ByVal aEndIndex As Int32)
Dim xTmp As Int32
Dim xLo As Int32 = aStartIndex
Dim xHi As Int32 = aEndIndex
Dim xPivot As Int32 = awRefArr((xLo + xHi) \ 2)
Dim xComp As Int32
Do Until (xLo > xHi)
xComp = awEmpArr(awRefArr(xLo)).CompareTo(awEmpArr(xPivot))
Do While (xComp < 0) OrElse ((xComp = 0) AndAlso (awRefArr(xLo) < xPivot))
xLo += 1
xComp = awEmpArr(awRefArr(xLo)).CompareTo(awEmpArr(xPivot))
Loop
xComp = awEmpArr(awRefArr(xHi)).CompareTo(awEmpArr(xPivot))
Do While (xComp > 0) OrElse ((xComp = 0) AndAlso (awRefArr(xHi) > xPivot))
xHi -= 1
xComp = awEmpArr(awRefArr(xHi)).CompareTo(awEmpArr(xPivot))
Loop
If xLo <= xHi Then
xTmp = awRefArr(xLo)
awRefArr(xLo) = awRefArr(xHi)
awRefArr(xHi) = xTmp
xLo += 1
xHi -= 1
End If
Loop
If xLo < aEndIndex Then QuickSortEmployee(awEmpArr, awRefArr, xLo, aEndIndex)
If xHi > aStartIndex Then QuickSortEmployee(awEmpArr, awRefArr, aStartIndex, xHi)
End Sub
Notes
The generalised QuickSort takes considerably longer to execute. The one included in my samples takes more than five times as long (on my machine) as "Raw QuickSort", probably for two reasons:
- It is calling an external method;
- It is a far more complicated comparison.
I am often encouraged by these disadvantages to instead implement multiple, customised QuickSorts, which often cost little extra coding and execute many times faster.
Summary
QuickSort is clearly a simple, succinct way of sorting data. The code footprint is of such small size that consideration should be given to replicating QuickSort code for different 'sorts' rather than excessively generalising it. It has been shown that it can be a stable sort as well as an easily generalised sort, sorting data directly in place of indirectly by reference. The uncluttered QuickSort is basically (pun intended) an elegant piece of logic – congratulations C. A. R. Hoare (1960).
History
- 2008-01-10 - Original article.