C++: InsertionSort Outperforms QuickSort on Almost In-Order Array

Shao Voon Wong

5.00/5 (7 votes)

31 Mar 2024CPOL3 min read

7.8K

InsertionSort outperforms QuickSort on almost In-Order Array

InsertionSort performs better than QuickSort on almost In-Order Array and examines the caveats to be aware of when utilizing InsertionSort.

This short writeup is a tip that demonstrates performance of insertion sort over quick sort on sorting almost in-order array. As such, it is not eligible to participate in the CodeProject monthly article competition.

Download BenchmarkInsertionSort2.zip - 4.6 KB

Performance Characteristics
Benchmark Code
Benchmark Results
Related Article
References
History

Performance Characteristics

The average time complexity of BubbleSort, SelectionSort and InsertionSort is O(N²) and QuickSort is O(N.Log N). When it comes to sorting almost sorted array, QuickSort's performance is degraded to O(N²) and InsertionSort's runtime complexity is promoted to O(N). Below is a list of time complexity where the best to the worst is ranked from left to right. O(N²) is the most undesirable algorithmic runtime. This reason led to C++17's Standard Library to implement std::sort based on Introsort. Unlike QuickSort, Introsort has a consistent runtime performance of O(N.Log N) but it still loses out to InsertionSort's O(N) on sorting almost in-order container.

O(1) < O(log N) < O(N) < O(N.log N) < O(N^2)

The use case for InsertionSort is an array in your application is required to be maintained in-order and from time to time, a new element is appended to it. A perfect example is the function to calculate median from a sorted array of elements. As an optimization, this array does not need to be sorted whenever an element is appended. Sorting is only required during median retrieval and new elements have been appended since the last retrieval. Obviously, the situation of too many unsorted element addition turning InsertionSort runtime from O(N) back to O(N²) must be avoided at all costs. Set a threshold when a certain ratio of appended elements to array size is reached, the InsertionSort must be triggered.

The BubbleSort, SelectionSort and InsertionSort routines used in this tip are listed as follows:

C++

template<typename T, typename Predicate>
void BubbleSort(T* arr, int size, Predicate compare)
{
    for (int i = size - 1; i > 0; --i)
    {
        for (int j = 0; j < i; ++j)
        {
            if (compare(arr[j + 1], arr[j]))
            {
                T temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
}

template<typename T, typename Predicate>
void SelectionSort(T* arr, int size, Predicate compare)
{
    for (int i = 0; i < size; ++i)
    {
        int minIndex = i;
        for (int j = i + 1; j < size; ++j)
        {
            if (compare(arr[j], arr[minIndex]))
                minIndex = j;
        }
        if (i != minIndex)
        {
            T temp = arr[i];
            arr[i] = arr[minIndex];
            arr[minIndex] = temp;
        }
    }
}

template<typename T, typename Predicate>
void InsertionSort(T* arr, int size, Predicate compare)
{
    // InsertionSort starts from 1, the second item.
    for (int i = 1; i < size; ++i)
    {
        T temp = arr[i];
        int j = i - 1;
        while (j > -1 && compare(temp, arr[j]))
        {
            arr[j + 1] = arr[j];
            arr[j] = temp;
            --j;
        }
    }
}

// Optimized Insertion Sort by Rehorav. Member link is below
// https://www.codeproject.com/script/Membership/View.aspx?mid=4828351
template<typename T, typename Predicate>
void InsertionSort2(T* arr, int size, Predicate compare)
{
    for (int i = 1; i < size; ++i)
    {
        T temp = arr[i];
        int j = i;

        // Shift elements to the right to make room for temp
        while (j > 0 && compare(temp, arr[j - 1]))
        {
            arr[j] = arr[j - 1];
            --j;
        }

        // Insert temp at its correct position
        arr[j] = temp;
    }
}

Benchmark Code

The benchmark setup is as follows. The source vector, vec, contains 100000 elements which are initialized from 1 to 100000. And 5 elements are removed and appended to the vector end so as to create an almost in-order array.

C++

const size_t VEC_SIZE = 100000;
std::vector<int> vec(VEC_SIZE);
std::iota(vec.begin(), vec.end(), 1);

// Erase these elements from vec
vec.erase(std::remove(vec.begin(), vec.end(), 50000), vec.end());
vec.erase(std::remove(vec.begin(), vec.end(), 89), vec.end());
vec.erase(std::remove(vec.begin(), vec.end(), 14568), vec.end());
vec.erase(std::remove(vec.begin(), vec.end(), 78962), vec.end());
vec.erase(std::remove(vec.begin(), vec.end(), 60022), vec.end());

// Then append them back to vec
vec.push_back(50000);
vec.push_back(89);
vec.push_back(14568);
vec.push_back(78962);
vec.push_back(60022);

std::vector<int> temp(VEC_SIZE);

The benchmark code is as follows. At each iteration, vec is assigned to the temp vector and the sorting is done on the temp vector. Each benchmark sorts temp in ascending order.
Note: The assert macro expands to nothing in release build so they do not negatively affect the release mode result. As a note of interest, I tried to add my home-made QuickSort to the benchmark but it resulted in stack overflow due to its recursive calls. My guess is the C++ Standard Library's quick sort routines are iterative in nature because they do not have the stack overflow problem.

C++

timer cStopWatch;
cStopWatch.start("C qsort Benchmark");
for (size_t k = 0; k < MAX_LOOP; ++k)
{
    temp = vec;

    assert(std::is_sorted(temp.cbegin(), temp.cend()) == false);

    qsort((void*)temp.data(), temp.size(), sizeof(temp[0]), comparator);

    assert(std::is_sorted(temp.cbegin(), temp.cend()));
}
cStopWatch.stop();

timer cppStopWatch;
cppStopWatch.start("C++ sort Benchmark");
for (size_t k = 0; k < MAX_LOOP; ++k)
{
    temp = vec;

    assert(std::is_sorted(temp.cbegin(), temp.cend()) == false);

    std::sort(temp.begin(), temp.end());

    assert(std::is_sorted(temp.cbegin(), temp.cend()));
}
cppStopWatch.stop();

timer cppParStopWatch;
cppParStopWatch.start("C++ par sort Benchmark");
for (size_t k = 0; k < MAX_LOOP; ++k)
{
    temp = vec;

    assert(std::is_sorted(temp.cbegin(), temp.cend()) == false);

    std::sort(std::execution::par, temp.begin(), temp.end());

    assert(std::is_sorted(temp.cbegin(), temp.cend()));
}
cppParStopWatch.stop();

timer insertionSortStopWatch;
insertionSortStopWatch.start("Insertion Sort Benchmark");
for (size_t k = 0; k < MAX_LOOP; ++k)
{
    temp = vec;

    assert(std::is_sorted(temp.cbegin(), temp.cend()) == false);

    InsertionSort(temp.data(), (int)temp.size(), [](int a, int b) {
        return a < b;
        });

    assert(std::is_sorted(temp.cbegin(), temp.cend()));
}
insertionSortStopWatch.stop();

timer bubbleSortStopWatch;
bubbleSortStopWatch.start("Bubble Sort Benchmark");
for (size_t k = 0; k < MAX_LOOP; ++k)
{
    temp = vec;

    assert(std::is_sorted(temp.cbegin(), temp.cend()) == false);

    BubbleSort(temp.data(), (int)temp.size(), [](int a, int b) {
        return a < b;
        });

    assert(std::is_sorted(temp.cbegin(), temp.cend()));
}
bubbleSortStopWatch.stop();

timer selectionSortStopWatch;
selectionSortStopWatch.start("Selection Sort Benchmark");
for (size_t k = 0; k < MAX_LOOP; ++k)
{
    temp = vec;

    assert(std::is_sorted(temp.cbegin(), temp.cend()) == false);

    SelectionSort(temp.data(), (int)temp.size(), [](int a, int b) {
        return a < b;
        });

    assert(std::is_sorted(temp.cbegin(), temp.cend()));
}
selectionSortStopWatch.stop();

Benchmark Results

This is the benchmark result on 1000 iterations of sorting of arrays of 100000 elements. As you can see, InsertionSort performs even better than parallel std::sort shown in Visual C++ results. VC++'s qsort and std::sort could be implemented using QuickSort and IntroSort. Remember, on almost sorted array, QuickSort has the worst performance of O(N²) while IntroSort is O(N.Log N) and InsertionSort has the best performance of O(N). The C++ Standard Library on Ubuntu 2022 has not yet implemented the parallelized version of std::sort as the serial and parallel results are largely in the same ballpark. The BubbleSort and SelectionSort benchmark are commented out because they took too long to complete. The code is benchmarked on Intel i7 8700 with 6 physical cores and 12 logical cores and 16GB memory.

VC++ 2022

                 C qsort Benchmark: 7881ms
                C++ sort Benchmark: 1105ms
            C++ par sort Benchmark:  418ms
          Insertion Sort Benchmark:  279ms
Insertion Sort Optimized Benchmark:  206ms

gcc version 11.4.0

                 C qsort Benchmark: 2394ms
                C++ sort Benchmark: 2782ms
            C++ par sort Benchmark: 2884ms
          Insertion Sort Benchmark:  451ms
Insertion Sort Optimized Benchmark:  354ms

clang version 14.0.0

                 C qsort Benchmark: 2573ms
                C++ sort Benchmark: 2774ms
            C++ par sort Benchmark: 2513ms
          Insertion Sort Benchmark:  383ms
Insertion Sort Optimized Benchmark:  351ms

To give the reader the perspective of magnitude of time complexity difference of BubbleSort and SelectionSort's O(N²) versus InsertionSort's O(N). I ran the 10 iterations instead of 1000 iterations. Below are the results:

VC++ 2022

       C qsort Benchmark:   83ms
      C++ sort Benchmark:   12ms
  C++ par sort Benchmark:    4ms
Insertion Sort Benchmark:    2ms
   Bubble Sort Benchmark:26641ms
Selection Sort Benchmark:39899ms

gcc version 11.4.0

       C qsort Benchmark:   74ms
      C++ sort Benchmark:   31ms
  C++ par sort Benchmark:   28ms
Insertion Sort Benchmark:    3ms
   Bubble Sort Benchmark:46323ms
Selection Sort Benchmark:26222ms

clang version 14.0.0

       C qsort Benchmark:   75ms
      C++ sort Benchmark:   27ms
  C++ par sort Benchmark:   28ms
Insertion Sort Benchmark:    3ms
   Bubble Sort Benchmark:17038ms
Selection Sort Benchmark:39072ms

The GCC and clang command to build BenchmarkInsertionSort.cpp is as follows:

g++ BenchmarkInsertionSort.cpp -O3 -std=c++17

clang++ BenchmarkInsertionSort.cpp -O3 -std=c++17

Two C++ Features Win Over C Equivalents

References

History

31^th March, 2024: Added the Optimized Insertion Sort by Rehorav to the benchmark results and the source code downloads.
18^th February, 2024: First release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)