Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / desktop / Win32

QuickSort Revisited, Optimised, and Stabilised

4.55/5 (13 votes)
10 Jan 2008CPOL5 min read 1   426  
QuickSort revisited, with optimisations to minimise machine cycles, stabilised to retain original order, and generalised for convenience.

Introduction

QuickSort was intended to be a very fast sort algorithm, with minimum execution time, a small code footprint, and the ability to sort a large quantity of data. I have been prompted to revisit QuickSort, a well thrashed topic, because several submissions and implementations have compromised speed for flexibility, and simplicity for one-size-fits-all complexity. Herein are suggestions for addressing those shortcomings, with code examples and a variety of implementations.

In this article, I will briefly cover several points of common interest, with some code examples. These points of interest are:

  • My original, very brief QuickSort algorithm implemented in VB.NET. I do not lay claim to inventing the algorithm. The implementation is very quick with no clutter.
  • An implementation of the algorithm that sorts 'By-Reference' rather than sorting 'In-Place'. The implementation is markedly faster than sorting data structures in place.
  • Another implementation that is 'Stable' in that it maintains the original order of same-value items.
  • A last, general implementation, that provides a class and method that may cater for many of your general QuickSort needs.

In the Beginning

The QuickSort algorithm was developed by C. A. R. Hoare in 1960, while working for the small British scientific computer manufacturer Elliott Brothers (Wikipedia, 2008).

The objective of QuickSort is to sort a large quantity of data in the shortest possible time. The algorithm employed is a recursive algorithm that separates the data into two partitions, 'high' (High) and 'low' (Low), that are sorted to the extent that no High data items are in the Low partition and no Low data items are in the High partition. For this purpose, the algorithm chooses an arbitrary 'pivot' value (Pivot) with which to partition the data. The algorithm does not have to be implemented as recursive, but it recursively executes the same logic for the Low and High partitions separately, thereby eventually sorting the entire data set.

A major disadvantage of the algorithm, in its simplest form, is that it is not a stable sort. In the process of swapping, the original order of items with the same 'value' may be put out of order. This means that raw QuickSort’s value as a transaction sorting engine is limited. However, the stable sort problem can easily be overcome at the expense of machine cycles.

Raw QuickSort In-Place

The following snippet is my original integer QuickSort, in-place algorithm. I do not claim to have invented this algorithm. The snippet is provided without comments to illustrate just how brief it is, and for a mental exercise to allow the reader to fathom how and why it can possibly work. Documented code is contained in the download. The result is a sorted awIntArr as Int32() array.

VB
Public Shared Sub QuickSortInt(ByRef awIntArr As Int32(), _
       ByVal aLoIdx As Int32, ByVal aHiIdx As Int32)
  Dim xTmp As Int32
  Dim xLo As Int32 = aLoIdx
  Dim xHi As Int32 = aHiIdx
  Dim xPivot As Int32 = awIntArr((xLo + xHi) \ 2)
  Do Until xLo > xHi
    Do While awIntArr(xLo) < xPivot
      xLo += 1
    Loop
    Do While awIntArr(xHi) > xPivot
      xHi -= 1
    Loop
    If xLo <= xHi Then
      xTmp = awIntArr(xLo)
      awIntArr(xLo) = awIntArr(xHi)
      awIntArr(xHi) = xTmp
      xLo += 1
      xHi -= 1
    End If
  Loop
  If (xLo < aHiIdx) Then QuickSortInt(awIntArr, xLo, aHiIdx)
  If (xHi > aLoIdx) Then QuickSortInt(awIntArr, aLoIdx, xHi)
End Sub

Notes

I acknowledge that the Pivot value should be chosen at random to avoid worst case sorting of O(n2) instead of the expected O(nLog2n). However, I perversely continue to choose the central index to avoid this already-sorted issue.

This code executes at under two seconds (on my drudge machine) in sorting one million integers ten times (see the samples). The 'bloated' common versions, one of which is included in the samples, take approximately 50% longer.

QuickSort By-Reference

QuickSort By-Reference does not move any of the original unsorted data, it simply sorts an array of indices that reference that data. The following snippet accepts an array of BigRecords, sorts on BigRecord.CustNo, and avoids moving any records around. Instead, it moves 32-bit integer references about, a breeze for a 32-bit machine. The result is a sorted awRefArr as Int32() array of references to the awBigRecArr as BigRecord() array of unsorted, original data. One bonus side effect is that when you have sorted your index array, you have also sorted your data in reverse order – you just read the resultant index array in reverse.

VB
Public Shared Sub QuickSortBigRecord _
      (ByRef awBigRecArr As BigRecord(), ByRef awRefArr As Int32(), _
       ByVal aStartIndex As Int32, ByVal aEndIndex As Int32)
  Dim xTmp As Int32                     ' Temp integer.
  Dim xLo As Int32 = aStartIndex        ' Working lower bound index.
  Dim xHi As Int32 = aEndIndex          ' Working upper bound index.
  Dim xPivot As Int32 = awRefArr((xLo + xHi) \ 2) ' Pivot reference.

  ' QuickSort.
  Do Until (xLo > xHi)

    ' Bump our Low Index while items are already in place.
    Do While awBigRecArr(awRefArr(xLo)).CustNo < awBigRecArr(xPivot).CustNo
      xLo += 1
    Loop

    ' Bump our High Index while items are already in place.
    Do While awBigRecArr(awRefArr(xHi)).CustNo > awBigRecArr(xPivot).CustNo
      xHi -= 1
    Loop

    ' Swap the low and high index items, if swappable. Bump indices.
    If xLo <= xHi Then
      xTmp = awRefArr(xLo)
      awRefArr(xLo) = awRefArr(xHi)
      awRefArr(xHi) = xTmp
      xLo += 1
      xHi -= 1
    End If

  Loop

  ' QuickSort the top partition.
  If xLo < aEndIndex Then
      QuickSortBigRecord(awBigRecArr, awRefArr, xLo, aEndIndex)
  ' QuickSort the bottom partition.
  If xHi > aStartIndex Then
      QuickSortBigRecord(awBigRecArr, awRefArr, aStartIndex, xHi)

End Sub

Notes

I generally use a reference QuickSort.

Stable QuickSort

When sorting data records, it is often useful to have the sequence of the data records preserved when items return the same value during the compare. Transactions that sell the same product to the same customer repeatedly over time may need to be kept 'Stable', in transaction date order, for some applications. QuickSort does not do this in its most efficient form ("Raw QuickSort"). The following changes to the "QuickSort By-Reference" snippet (above) achieve this with very little coding, without adding too many machine cycles to the processing.

VB
' Bump our Low Index while items are already in place.
Do While (awBigRecArr(awRefArr(xLo)) < awBigRecArr(xPivot)) _
OrElse ((awBigRecArr(awRefArr(xLo)) = awBigRecArr(xPivot)) _
         AndAlso (awRefArr(xLo) < xPivot))
  xLo += 1
Loop

' Bump our High Index while items are already in place.
Do While (awBigRecArr(awRefArr(xHi)) > awBigRecArr(xPivot)) _
OrElse ((awBigRecArr(awRefArr(xHi)) = awBigRecArr(xPivot)) _
         AndAlso (awRefArr(xHi) > xPivot))
  xHi -= 1
Loop

Notes

I generally have little need for a stable QuickSort; however, it consumes only a small percentage of additional machine cycles.

General QuickSort

I have seen at The Code Project and elsewhere, numerous exercises in generalising QuickSort to meet every situation. These efforts have, in many cases, led to bloated, cluttered, and less efficient code. My attempt here simply uses a base class that must be inherited, that implements IComparable. It is reasonably trivial to add your particular data type to a descendant class, implement IComparable.CompareTo, then call the generalised QuickSort.

The base class:

VB
''' <summary>
''' Base MustInherit class implementing
''' IComparable(Of clsBaseIComparable).
''' </summary>
''' <remarks></remarks>
Public MustInherit Class clsBaseIComparable
  Implements IComparable(Of clsBaseIComparable)

  Public MustOverride Function CompareTo(ByVal other As clsBaseIComparable) _
        As Integer Implements System.IComparable(Of clsBaseIComparable).CompareTo

End Class

A descendant class:

VB
''' <summary>
''' Employee data, inherited from clsBaseIComparable,
''' implementing IComparable.
''' </summary>
''' <remarks>
''' Devised for testing general QuickSort methods.
''' </remarks>
Public Class clsEmployee
  Inherits clsBaseIComparable

  Friend fFirstName As String
  Friend fLastName As String
  Friend fRole As String
  Friend fSalary as Int32

  Public Sub New _
        (ByVal aFirstName As String _
        , ByVal aLastName As String _
        , ByVal aRole As String _
        , ByVal aSalary as Int32)
    Me.fFirstName = aFirstName
    Me.fLastName = aLastName
    Me.fRole = aRole
    Me.fSalary = aSalary
  End Sub

  ''' <summary>
  ''' Implementing IComparable.CompareTo,
  ''' sorting Name within Salary within Role.
  ''' </summary>
  ''' <param name="other"> Compared with clsEmployee. </param>
  ''' <returns>
  ''' Negative 1 if Me less-than Other.
  ''' Positive 1 if Me greater-than Other.
  ''' Zero if Me equal-to Other.
  ''' </returns>
  ''' <remarks>
  ''' Devised for testing general QuickSort methods.
  ''' DirectCase is used because it is a compile-time cast,
  ''' i.e. does not incur run-time overhead. This is a
  ''' more complex than usual comparison.
  ''' </remarks>
  Public Overrides Function CompareTo(ByVal other As clsBaseIComparable) As Integer
    If Me.fRole < DirectCast(other, clsEmployee).fRole Then
      Return -1
    ElseIf Me.fRole > DirectCast(other, clsEmployee).fRole Then
      Return +1
    ElseIf Me.fSalary < DirectCast(other, clsEmployee).fSalary Then
      Return -1
    ElseIf Me.fSalary > DirectCast(other, clsEmployee).fSalary Then
      Return +1
    ElseIf Me.fLastName < DirectCast(other, clsEmployee).fLastName Then
      Return -1
    ElseIf Me.fLastName > DirectCast(other, clsEmployee).fLastName Then
      Return +1
    ElseIf Me.fFirstName < DirectCast(other, clsEmployee).fFirstName Then
      Return -1
    ElseIf Me.fFirstName > DirectCast(other, clsEmployee).fFirstName Then
      Return +1
    Else
      Return 0
    End If
  End Function

End Class

Not too grueling so far. Now, for the QuickSort. All that has been added to the "Stable QuickSort" implementation is an xComp variable to avoid multiple calls to CompareTo, and the use of CompareTo in place of greater than and less than:

VB
Public Shared Sub QuickSortEmployee _
      (ByRef awEmpArr As clsEmployee(), ByRef awRefArr As Int32(), _
       ByVal aStartIndex As Int32, ByVal aEndIndex As Int32)
  Dim xTmp As Int32                     ' Temp integer.
  Dim xLo As Int32 = aStartIndex        ' Working lower bound index.
  Dim xHi As Int32 = aEndIndex          ' Working upper bound index.
  Dim xPivot As Int32 = awRefArr((xLo + xHi) \ 2) ' Pivot value.
  Dim xComp As Int32                    ' CompareTo value.

  ' QuickSort.
  Do Until (xLo > xHi)

    ' Bump our Low Index while items are already in place.
    ' Keep a stable order.
    xComp = awEmpArr(awRefArr(xLo)).CompareTo(awEmpArr(xPivot))
    Do While  (xComp < 0) OrElse ((xComp = 0) AndAlso (awRefArr(xLo) < xPivot))
      xLo += 1
      xComp = awEmpArr(awRefArr(xLo)).CompareTo(awEmpArr(xPivot))
    Loop

    ' Bump our High Index while items are already in place.
    ' Keep a stable order.
    xComp = awEmpArr(awRefArr(xHi)).CompareTo(awEmpArr(xPivot))
    Do While (xComp > 0) OrElse ((xComp = 0) AndAlso (awRefArr(xHi) > xPivot))
      xHi -= 1
      xComp = awEmpArr(awRefArr(xHi)).CompareTo(awEmpArr(xPivot))
    Loop

    ' Swap the low and high index items, if swappable. Bump indices.
    If xLo <= xHi Then
      xTmp = awRefArr(xLo)
      awRefArr(xLo) = awRefArr(xHi)
      awRefArr(xHi) = xTmp
      xLo += 1
      xHi -= 1
    End If

  Loop

  ' QuickSort the top partition.
  If xLo < aEndIndex Then QuickSortEmployee(awEmpArr, awRefArr, xLo, aEndIndex)
  ' QuickSort the bottom partition.
  If xHi > aStartIndex Then QuickSortEmployee(awEmpArr, awRefArr, aStartIndex, xHi)

End Sub

Notes

The generalised QuickSort takes considerably longer to execute. The one included in my samples takes more than five times as long (on my machine) as "Raw QuickSort", probably for two reasons:

  1. It is calling an external method;
  2. It is a far more complicated comparison.

I am often encouraged by these disadvantages to instead implement multiple, customised QuickSorts, which often cost little extra coding and execute many times faster.

Summary

QuickSort is clearly a simple, succinct way of sorting data. The code footprint is of such small size that consideration should be given to replicating QuickSort code for different 'sorts' rather than excessively generalising it. It has been shown that it can be a stable sort as well as an easily generalised sort, sorting data directly in place of indirectly by reference. The uncluttered QuickSort is basically (pun intended) an elegant piece of logic – congratulations C. A. R. Hoare (1960).

History

  • 2008-01-10 - Original article.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)