Table of Contents
Introduction
This article is a crash course on the STL's std::sort
comparison functions. I have an ex-colleague who used std::sort
to sort an array of strings in ascending order and then copied the sorted array into another array in a reverse order, in order to achieve sorting in a descending order. His method is a waste of processor cycles and memory. He could have achieved sorting in a descending order by using the std::greater<string>
function object defined in the functional
header or writing a comparison function for predicate version of std::sort
. That incident prompted me to write this article. This article is a quick guide for programmers looking for information on how to write comparison function for std::sort
. Without further ado, let us begin!
How To Use std::sort
template <class RandomAccessIterator>
void sort ( RandomAccessIterator first, RandomAccessIterator last );
template <class RandomAccessIterator, class Compare>
void sort ( RandomAccessIterator first, RandomAccessIterator last, Compare comp );
Let's do a quick refresh on how to use std::sort
on arrays and containers like std::tr1::array, std::vector
and std::deque
. For sorting the array in ascending order, you need to specify the address of the first element of the array for the first parameter and the address of one past the last element of the array for the second parameter. Below is an example:
#include <algorithm>
#include <iostream>
#include <string>
int main()
{
int arr[5];
arr[0] = 2;
arr[1] = 3;
arr[2] = 5;
arr[3] = 1;
arr[4] = 4;
std::sort(arr,arr+5);
std::cout<<arr[0]<<"\n";
std::cout<<arr[1]<<"\n";
std::cout<<arr[2]<<"\n";
std::cout<<arr[3]<<"\n";
std::cout<<arr[4]<<std::endl;
return 0;
}
The output is as follows:
1
2
3
4
5
To achieve descending order, include the functional header and replace std::sort(arr,arr+5);
line with std::sort(arr,arr+5,std::greater<int>());
#include <algorithm>
#include <iostream>
#include <string>
#include <functional>
int main()
{
int arr[5];
arr[0] = 2;
arr[1] = 3;
arr[2] = 5;
arr[3] = 1;
arr[4] = 4;
std::sort(arr,arr+5,std::greater<int>());
std::cout<<arr[0]<<"\n";
std::cout<<arr[1]<<"\n";
std::cout<<arr[2]<<"\n";
std::cout<<arr[3]<<"\n";
std::cout<<arr[4]<<std::endl;
return 0;
}
The output is as follows:
5
4
3
2
1
For sorting std::tr1::array
, std::vector
or std::deque
in ascending order, you specify the begin()
method which returns the first iterator, as first parameter and end()
method which returns the one iterator past the last iterator, as the second parameter. Below is an example of sorting the std::vector
:
#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
int main()
{
std::vector<int> vec;
vec.push_back(2);
vec.push_back(3);
vec.push_back(5);
vec.push_back(1);
vec.push_back(4);
std::sort(vec.begin(),vec.end());
for(size_t i=0; i<vec.size(); ++i)
std::cout<<vec[i]<<std::endl;
return 0;
}
The output for the above code snippet is as follows:
1
2
3
4
5
Please take note that you cannot use std::sort
to sort std::list
which is a single-linked list because std::sort
requires random access iterators, to work, which std::list
does not have. For std::list
, please use its sort
member function instead.
Global < Operator
std::sort
calls std::less
for its comparison and std::less
, in turn, calls the <
operator of the element type. For POD types, the <
operator is already defined. For your own custom classes, you need to define our own <
operator. There are two ways to go about defining your own <
operator, you can define it as either a global <
operator or member <
operator function of the class. Below is an example of how to define a global <
operator for the Person
class:
#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
class Person
{
public:
Person() : age(0) {}
Person(int age, std::string name) {
this->age = age; this->name = name;
}
int age;
std::string name;
};
inline bool operator<(const Person& a, const Person& b)
{
return a.age < b.age;
}
int main()
{
std::vector<Person> vecPerson;
vecPerson.push_back(Person(24,"Calvin"));
vecPerson.push_back(Person(30,"Benny"));
vecPerson.push_back(Person(28,"Alison"));
std::sort(vecPerson.begin(),vecPerson.end());
for(size_t i=0; i<vecPerson.size(); ++i)
std::cout<<vecPerson[i].age<<", "<<vecPerson[i].name<<std::endl;
return 0;
}
The output is as below:
24, Calvin
28, Alison
30, Benny
Member < Operator
Next is an example of how to define a member < operator for the Person
class:
#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
class Person
{
public:
Person() : age(0) {}
Person(int age, std::string name) {
this->age = age; this->name = name;
}
bool operator<(const Person& rhs)
{
return this->age < rhs.age;
}
int age;
std::string name;
};
int main()
{
std::vector<Person> vecPerson;
vecPerson.push_back(Person(24,"Calvin"));
vecPerson.push_back(Person(30,"Benny"));
vecPerson.push_back(Person(28,"Alison"));
std::sort(vecPerson.begin(),vecPerson.end());
for(size_t i=0; i<vecPerson.size(); ++i)
std::cout<<vecPerson[i].age<<", "<<vecPerson[i].name<<std::endl;
return 0;
}
The output is the same as the global < operator example:
24, Calvin
28, Alison
30, Benny
You might ask if there is any advantage for the member operator function over the global operator function. Yes, there is! A member operator function can access the private
and protected
data members of the class whereas a global operator function cannot. If age in the Person
class is a private
member, global operator functions cannot access it without using an accessor function.
A Function Pointer as Comparison Function
What if we want to sort vector of Person
by descending order? There are 2 ways to go about it: we can either define a comparison function and pass its function pointer to the predicate version of std::sort
, or define a comparison function object (also known as functors) and pass it to the predicate version of std::sort
as a comparison predicate.
If we are interested in knowing who are the seniors in age, we may sort by age descending and followed by name ascending. This is custom sorting, as opposed to sorting in descending order. Below is an example, using function pointer as a sorting predicate.
#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
class Person
{
public:
Person() : age(0) {}
Person(int age, std::string name) {
this->age = age; this->name = name;
}
int age;
std::string name;
};
bool Greater(const Person& a, const Person& b)
{
if(a.age == b.age)
return a.name < b.name;
return a.age > b.age;
}
int main()
{
std::vector<Person> vecPerson;
vecPerson.push_back(Person(24,"Calvin"));
vecPerson.push_back(Person(30,"Benny"));
vecPerson.push_back(Person(30,"Alice"));
vecPerson.push_back(Person(28,"Alison"));
std::sort(vecPerson.begin(),vecPerson.end(),Greater);
for(size_t i=0; i<vecPerson.size(); ++i)
std::cout<<vecPerson[i].age<<", "<<vecPerson[i].name<<std::endl;
return 0;
}
The output is as follows:
30, Alice
30, Benny
28, Alison
24, Calvin
A Function Object as Comparison Function
Below is an example of sorting, using function object (which is also known as functor) as a predicate. Function object is actually a class with the ()
operator overloaded. You may read this wikipedia on function object for more information.
#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
#include <functional>
class Person
{
public:
Person() : age(0) {}
Person(int age, std::string name) {
this->age = age; this->name = name;
}
int age;
std::string name;
};
struct GreaterAge : public std::binary_function<Person,Person,bool>
{
inline bool operator()(const Person& a, const Person& b)
{
if(a.age == b.age)
return a.name < b.name;
return a.age > b.age;
}
};
int main()
{
std::vector<Person> vecPerson;
vecPerson.push_back(Person(24,"Calvin"));
vecPerson.push_back(Person(30,"Benny"));
vecPerson.push_back(Person(30,"Alice"));
vecPerson.push_back(Person(28,"Alison"));
std::sort(vecPerson.begin(),vecPerson.end(),GreaterAge());
for(size_t i=0; i<vecPerson.size(); ++i)
std::cout<<vecPerson[i].age<<", "<<vecPerson[i].name<<std::endl;
return 0;
}
This is the output:
30, Alice
30, Benny
28, Alison
24, Calvin
Note: If we do not derive our function object from std::binary_function
class, the above example would work just as well. Below is an example of the above function object not inheriting from any class:
struct GreaterAge
{
inline bool operator()(const Person& a, const Person& b)
{
if(a.age == b.age)
return a.name < b.name;
return a.age > b.age;
}
};
You may ask, between function pointer and function object, which is the preferred choice for writing comparison. The answer is function object because we can inline a function object for better performance whereas we cannot inline a function pointer. This is the reason why std::sort
performs better than C's qsort
. As C++ programmers, we should always use std::sort
over qsort
because qsort
does not have type safety.
Boost Lambda Library
We can also specify the comparison function as an anonymous function, using Boost Lambda Library or C++0x Lambda. We shall see an example of using the Boost Lambda as an anonymous comparison function for sorting an array of integers in descending order. We need to specify the return type of the lambda as boolean. _1
and _2
are the 2 integer arguments passed to the Lambda comparison function. Note: as seen in the earlier second code example in this article, we can achieve the same thing, using greater<int>
predicate.
#include <algorithm>
#include <iostream>
#include <string>
#include <cstdlib>
#include <functional>
#include <boost/lambda/lambda.hpp>
int main()
{
int arr[5];
arr[0] = 2;
arr[1] = 3;
arr[2] = 5;
arr[3] = 1;
arr[4] = 4;
using namespace boost::lambda;
std::sort(arr,arr+5, ret<bool>(_1 > _2));
std::cout<<arr[0]<<"\n";
std::cout<<arr[1]<<"\n";
std::cout<<arr[2]<<"\n";
std::cout<<arr[3]<<"\n";
std::cout<<arr[4]<<std::endl;
return 0;
}
The output is as follows:
5
4
3
2
1
In the second Boost Lambda example, we are going to sort a vector of Person
objects in ascending order.
#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/casts.hpp>
#include <boost/function/function_base.hpp>
class Person
{
public:
Person() : age(0) {}
Person(int age, std::string name) {
this->age = age; this->name = name;
}
int age;
std::string name;
};
int main()
{
std::vector<Person> vecPerson;
vecPerson.push_back(Person(24,"Calvin"));
vecPerson.push_back(Person(30,"Benny"));
vecPerson.push_back(Person(28,"Alison"));
using namespace boost::lambda;
std::sort(vecPerson.begin(),vecPerson.end(),
ret<bool>( (&_1 ->* &Person::age) < (&_2 ->* &Person::age) ) );
for(size_t i=0; i<vecPerson.size(); ++i)
std::cout<<vecPerson[i].age<<", "<<vecPerson[i].name<<std::endl;
return 0;
}
The output is as follows:
24, Calvin
28, Alison
30, Benny
As we can see, this Boost Lambda syntax is more complex. The return type is specified first, followed by expression to compare the Person::age
. To get the value of Person::age
member, we need to put ampersand on the left side of the arguments(_1
and _2
) to convert it into an pointer so that we can use the pointer-to-member(->*
) operator to access Person::age
member. Ampersand is also put on the left of Person::age
to get its address. If your vanilla comparison function does more than comparing class members, you will have a hard time in converting it to a Boost Lambda function because you have to use Boost Lambda's unnatural constructs to do if
-else
, switch
-case
, while
-loop
and so on. For more information on Boost Lambda, please refer to its documentation here. Next, let us look at the C++0x Lambda.
C++0x Lambda
Lambda is a language feature in the new upcoming C++ standard. To try out the C++0x Lambda examples listed below, you need to install Visual Studio 2010 Beta 1. You can download it here. You are advised to install Visual Studio 2010 Beta 1 in Virtual PC so as not to disrupt your current Visual Studio installation. You can download Virtual PC's OS images here. For more instructions on how to install Visual Studio 2010 Beta 1, you can refer to this website.
Let us look at the example of using the C++0x Lambda as an anonymous comparison function for sorting an array of integers in descending order.
#include <algorithm>
#include <iostream>
#include <string>
#include <cstdlib>
int main()
{
int arr[5];
arr[0] = 2;
arr[1] = 3;
arr[2] = 5;
arr[3] = 1;
arr[4] = 4;
std::sort(arr,arr+5, [](int x, int y){return x > y;});
std::cout<<arr[0]<<"\n";
std::cout<<arr[1]<<"\n";
std::cout<<arr[2]<<"\n";
std::cout<<arr[3]<<"\n";
std::cout<<arr[4]<<std::endl;
return 0;
}
[]
denotes the start of C++0x lambda. x
and y
are the arguments to the lambda. We are free to name these 2 arguments as we like, unlike in Boost Lambda, it has to be _1
and _2
. We did not specify the return type of the lambda, it is automatically inferred from the only return statement. We can get away without specifying the return type of lambda but we are only restricted to 1 expression which is the return statement.
The output is as follows:
5
4
3
2
1
In the second C++0x Lambda example, we are going to sort a vector of Person
objects with the age
value in descending order and if the 2 age
values are equal, then it is resolved by sorting the name in ascending order. Since this lambda has more than 1 statement, we need to specify its boolean return type, using this syntax, -> bool
.
#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
class Person
{
public:
Person() : age(0) {}
Person(int age, std::string name) {
this->age = age; this->name = name;
}
int age;
std::string name;
};
int main()
{
std::vector<Person> vecPerson;
vecPerson.push_back(Person(24,"Calvin"));
vecPerson.push_back(Person(30,"Benny"));
vecPerson.push_back(Person(30,"Alice"));
vecPerson.push_back(Person(28,"Alison"));
std::sort(vecPerson.begin(),vecPerson.end(),
[](const Person& a, const Person& b) -> bool
{
if(a.age == b.age)
return a.name < b.name;
return a.age > b.age;
});
for(size_t i=0; i<vecPerson.size(); ++i)
std::cout<<vecPerson[i].age<<", "<<vecPerson[i].name<<std::endl;
return 0;
}
We can afford to do more with C++0x Lambda than Boost Lambda because the C++0x Lambda body syntax (refers to code enclosed in the curly parenthesis) is normal C++ syntax, therefore it is more natural and is easy to understand. For more information on C++0x Lambda syntax, please refer to this page.
The output is as follows:
30, Alice
30, Benny
28, Alison
24, Calvin
Boost Lambda is implemented as a library while C++0x Lambda is implemented as a language feature. This explains why Boost Lambda syntax is convoluted, compared to C++0x Lambda syntax, due to workarounds to get over the C98 C++ language limitations.
qsort
Since we are on the topic of comparison function, let me show you briefly how to write the comparison function for qsort
. For sorting by ascending order, the comparison function has to return 0 if the 2 arguments are equal; it should return greater than zero if the first argument is greater than the second argument and return lesser than zero if the first argument is lesser than the second argument. Of course, you would reverse the return values if you want sorting by descending order. Below is an example of the qsort
comparison function. For more information, please read this MSDN link:
#include <iostream>
#include <string>
#include <cstdlib>
int compare(const void* x, const void* y)
{
int* a = (int*)x;
int* b = (int*)y;
if(*a==*b)
return 0;
return *a > *b ? +1 : -1;
}
int main()
{
int arr[5];
arr[0] = 2;
arr[1] = 3;
arr[2] = 5;
arr[3] = 1;
arr[4] = 4;
std::qsort(arr,5,sizeof(int),compare);
std::cout<<arr[0]<<"\n";
std::cout<<arr[1]<<"\n";
std::cout<<arr[2]<<"\n";
std::cout<<arr[3]<<"\n";
std::cout<<arr[4]<<std::endl;
return 0;
}
The output is as follows:
1
2
3
4
5
std::stable_sort
template <class RandomAccessIterator>
void stable_sort ( RandomAccessIterator first, RandomAccessIterator last );
template <class RandomAccessIterator, class Compare>
void stable_sort ( RandomAccessIterator first, RandomAccessIterator last,
Compare comp );
Since we are on the topic on std::sort
, I'll touch briefly on std::stable_sort
and other STL sort algorithms. The usage of std::stable_sort
is the same as std::sort
. The only difference is std::stable_sort
will respect the original order of the elements which are equal whereas std::sort
does not. std::stable_sort
performs this trick at the expense of some performance because there are more comparisons made in order to determine if any 2 elements are equal. The strange thing is std::stable_sort
does not require the element type to have the equality operator(==
) defined; std::stable_sort
determines if 2 elements are equal by using this expression !(x<y) && !(y<x)
. The comparison function is the same as std::sort
. For more information on std::stable_sort
, you may read here.
std::partial_sort
If one want a sorted range, there is no need to sort the whole collection. This is where partial_sort
comes into picture. (For example, boss only cares about the top 3 salesmen)
template <class RandomAccessIterator>
void partial_sort ( RandomAccessIterator first, RandomAccessIterator middle,
RandomAccessIterator last );
template <class RandomAccessIterator, class Compare>
void partial_sort ( RandomAccessIterator first, RandomAccessIterator middle,
RandomAccessIterator last, Compare comp );
std::partial_sort
rearranges elements in such a way that the subrange [first,middle) contains the smallest elements of the entire range sorted in ascending order, and the subrange [middle,end) contains the remaining elements without any specific order. The comparison function is the same as std::sort
. For more information on std::partial_sort
, you may read here.
std::partial_sort_copy
template <class InputIterator, class RandomAccessIterator>
RandomAccessIterator
partial_sort_copy ( InputIterator first,InputIterator last,
RandomAccessIterator result_first,
RandomAccessIterator result_last );
template <class InputIterator, class RandomAccessIterator, class Compare>
RandomAccessIterator
partial_sort_copy ( InputIterator first,InputIterator last,
RandomAccessIterator result_first,
RandomAccessIterator result_last, Compare comp );
std::partial_sort_copy
sorts the same way as std::partial_sort
except it would copy the results to [results_first
,results_last
). The comparison function is the same as std::sort
. For more information on std::partial_sort_copy
, you may read this link.
std::nth_element
If one is only interested in sorted value of a single position, he can use nth_element
. Other elements may or may not be sorted.
template <class RandomAccessIterator>
void nth_element (RandomAccessIterator first, RandomAccessIterator nth,
RandomAccessIterator last);
template <class RandomAccessIterator, class Compare>
void nth_element (RandomAccessIterator first, RandomAccessIterator nth,
RandomAccessIterator last, Compare comp);
Conclusion
In this article, we have looked at how to use std::sort
, how to define a <
operators and the custom comparison function as either a function pointer or a function object or lambdas. We have also looked briefly at qsort
, std::stable_sort
, std::partial_sort
and std::partial_sort_copy
. If you like this article or articles on STL, I can write more articles on useful STL algorithms in the future. Thank you for reading!
History
- 2nd October, 2016: Added
nth_element
section and explanation in partial_sort
section - 3rd August, 2009: Added Boost Lambda and C++0x Lambda sections
- 21st July, 2009: First release