Introduction
sorted_vector
adapts a std::vector
to the interface required by
std::set/std::multiset
, thereby providing set/multiset and vector
functionality (random access) in one container.
Tests show that sorted_vector
's element retrieval (find) speed outperforms
that of std::set/std::multiset
by a factor of 2 (on most machines). On the downward side is
the poor performance of sorted_vector
's insert member function, because it
inserts an element somewhere in the middle of the sorted vector in order
to preserve the sort order. By using sorted_vector
's low level interface one can
solve this performance bottleneck by inserting a bunch of objects using a series of
push_back
's followed by a call to the member function sort
, which restores the sort
order.
sorted_vector
should be preferred over std::set/std::multiset
if many small elements
must be stored. Most STL implementations use a variant of a balanced tree to
implement set and multiset, which imposes a per element storage overhead of 3 pointers.
The most important reason to use a std::set/std::multiset
instead of
sorted_vector
is
the need to keep iterators into a set/multiset while inserting more elements into
the set. These iterators remain valid in the case of a set/multiset, but are invalidated
in the case of a sorted_vector
container.
Namespace
sorted_vector
resides in the namespace codeproject
.
Basic Usage
The following small table shows corresponding declarations of sorted_vector
's
and set
's/multiset
's.
STL concept |
std library |
sorted_vector |
set |
set<Key,Pred> |
sorted_vector<Key,true,Pred> |
multiset |
mutliset<Key,Pred> |
sorted_vector<Key,Pred> |
(sorted_vector<Key,Pred,true>
means sorted_vector<Key,Pred,bNoDuplicates=true>
)
The various set/multiset insert
and erase
member functions as well as the
set/multiset functions count, lower_bound,upper_bound, equal_range
are part of
sorted_vector
's interface and behave the same as there set/multiset counterparts.
The following code snippet shows the use of a sorted_vector
to hold the
set intersection of the content of two std::vector
's.
#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t
BuildIntersection( const std::vector<int>& v0,const std::vector<int>& v1,
codeproject::sorted_vector<int,true>& svecIntersection)
{
codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
for(int i=0;i<v1.size();i++){
if( svec.find(v1[i])!=svec.end() ){
svecIntersection.insert(v1[i]);
}
}
return svecIntersection.size();
}
The code example shows the use of the member functions find
and insert
.
If you replace codeproject::sorted_vector<int,true>
by std::set<int>
you get exactly
the same result. This piece of code can be optimized for speed by replacing the insert calls in the loop
by calls to push_back
of the base container (a vector) followed by a call
to the member function sort
at the end of the loop.
#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t
BuildIntersection1( const std::vector<int>& v0,const std::vector<int>& v1,
codeproject::sorted_vector<int,true>& svecIntersection)
{
codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
codeproject::sorted_vector<int,true>::Cont& vInterSect
= svecIntersection.get_container();
for(int i=0;i<v1.size();i++){
if( svec.find(v1[i])!=svec.end() ){
vInterSect.push_back(v1[i]);
}
}
svecIntersection.sort();
return svecIntersection.size();
}
Interface of sorted_vector
Member
|
Coming from
|
Description
|
Cont |
sorted_vector |
container type, the type of the container used to store the controlled sequence. |
value_type |
vector |
The type of object, T, stored in the set. |
key_type |
set/multiset |
The key type associated with value_type. |
key_compare |
set/multiset |
Function object that compares two keys for ordering. |
value_compare |
set/multiset |
Function object that compares two values for ordering. |
pointer |
vector |
Pointer to T. |
reference |
vector |
Reference to T |
const_reference |
vector |
Const reference to T |
size_type |
vector |
An unsigned integral type. |
difference_type |
vector |
A signed integral type. |
iterator |
vector |
Random access iterator used to iterate through a vector. |
const_iterator |
vector |
Random access const iterator used to iterate through a vector |
reverse_iterator |
vector |
Random access iterator used to iterate backwards through a vector.
|
const_reverse_iterator |
vector |
Random access const iterator used to iterate backwards through a vector. |
iterator begin() const |
vector |
Returns an iterator pointing to the beginning of the sorted_vector. |
iterator end() const |
vector |
Returns an iterator pointing to the end of the sorted_vector. |
reverse_iterator rbegin() const |
vector |
Returns a reverse_iterator pointing to the beginning of the reversed sorted_vector |
reverse_iterator rend() const |
vector |
Returns a reverse_iterator pointing to the end of the reversed sorted_vector. |
size_type size() const |
vector |
Returns the size of the sorted_vector. |
size_type max_size() const |
vector |
Returns the largest possible size of the sorted_vector. |
bool empty() const |
vector |
true if the sorted_vec's size is 0. |
key_compare key_comp() const |
set/multiset |
Returns the key_compare object used by the sorted_vector. |
value_compare value_comp() const |
set/multiset |
Returns the value_compare object used by the sorted_vector. |
sorted_vector() |
set/multiset |
Creates an empty sorted_vector. |
sorted_vector(const key_compare& comp) |
set/multiset |
Creates an empty sorted_vector, using comp as the key_compare object. |
VC++7.0: template <class InputIterator> sorted_vector(InputIterator f, InputIterator l)
VC++6.0: sorted_vector(const_iterator f, const_iterator l)
|
set/multiset |
Creates a sorted_vector with a copy of a range. |
VC++7.0: template <class InputIterator> sorted_vector(InputIterator f, InputIterator l,const key_compare& comp)
VC++6.0: sorted_vector(const_iterator f, const_iterator l,const key_compare& comp)
|
set/multiset |
Creates a sorted_vector with a copy of a range, using comp as the key_compare object. |
sorted_vector(const sorted_vector&) |
set/multiset |
The copy constructor. |
sorted_vector& operator=(const sorted_vector&) |
set/multiset |
The assignment operator (assigns other sorted_vector ) |
sorted_vector& operator=(const Cont&) |
sorted_vector |
The assignment operator (assigns other vector<T> ) |
void swap(sorted_vector&) |
set/multiset |
Swaps the contents of two sorted_vector's |
pair<iterator, bool>insert(const value_type& x) |
set/multiset |
Inserts x into the sorted_vector. |
iterator insert(iterator pos, const value_type& x) |
set/multiset |
Inserts x into the sorted_vector, using pos as a hint to where it will be
inserted.
|
VC++7.0: template <class InputIterator> void insert(InputIterator f, InputIterator l)
VC++6.0: void insert(const_iterator first f, const_iterator l)
|
set/multiset |
Inserts a range into the sorted_vector. |
void erase(iterator pos) |
vector |
Erases the element pointed to by pos. |
size_type erase(const key_type& k) |
set/multiset |
Erases the element whose key is k. |
void erase(iterator first, iterator last) |
vector |
Erases all elements in a range. |
void clear() |
vector |
Erases all of the elements. |
iterator find(const key_type& k) const |
set/multiset |
Finds an element whose key is k. |
size_type count(const key_type& k) const |
set/multiset |
Counts the number of elements whose key is k. |
iterator lower_bound(const key_type& k) const |
set/multiset |
Finds the first element whose key is not less than k. |
iterator upper_bound(const key_type& k) const |
set/multiset |
Finds the first element whose key is greater than k. |
pair<iterator, iterator> equal_range(const key_type& k) const |
set/multiset |
Finds a range containing all elements whose key is k. |
bool operator==(const sorted_vector&,const sorted_vector&) |
vector |
Tests two sorted_vector for equality. This is a global function, not a member function.
There is also an operator != |
bool operator<(const sorted_vector&, const sorted_vector&) |
vector |
Lexicographical comparison. This is a global function, not a member function.
There are also operators <= >= and > |
Cont& get_container() |
sorted_vector |
Returns a reference to the internal vector used to store the controlled sequence |
void sort() |
sorted_vector |
Restores the sort order using key_compare |
reference at(size_type i) |
sorted_vector |
Returns a reference to the element at *(begin()+i) with range check |
const_reference at(size_type i) const |
sorted_vector |
Returns a reference to the element at *(begin()+i) with range check |
const_reference operator[](size_type i) const |
sorted_vector |
Returns a reference to the element at *(begin()+i) |
reference operator[](size_type i) |
sorted_vector |
Returns a reference to the element at *(begin()+i) |
const_reference front() const |
sorted_vector |
Returns a reference to the first element |
reference front() |
sorted_vector |
Returns a reference to the first element |
reference back() |
sorted_vector |
Returns a reference to the last element |
const_reference back() const |
sorted_vector |
Returns a reference to the last element |
void pop_back() |
sorted_vector |
Removes the last element from the sorted_vector |
Points of Interest
An interview with Alexander Stepanov (inventor of the STL) at
http://www.stlport.org/resources/StepanovUSA.html
, in which he proposed to implement
a sorted container as an adapter of a std::container
gave me the idea for this work.
The real surprise to me was the unexpected good performance of sorted_vector
compared
to set/multiset. The outcome of my tests indicate, that the std::set/std::multiset
has only one (important) advantage over a sorted vector, namely that iterators remain
valid in a set/multiset, even when more elements are added.
The implementation work itself was easy and did not require any advanced C++/STL feature.
It can be summarized as follows:
Most of the member functions of sorted_vector
forward the call to the vec_
data member,
which is a std::vector
. The set/multiset specific functions for inserting/erasing and
locating elements mostly use STL algorithms to deal with the sorted sequence present
in the vec_
data member. The low level interface consisting of the member functions
get_container()
(which returns a reference to the vec_
data member)
and sort
and stable_sort
(which restore the sort order) are necessary to improve insertion performance (by allowing
a temporary violation of the sorting order).
Only one function was unexpectedly difficult to implement: The function unique
, which
must be called after sort
and stable_sort
in the case of a set.
This function is part of the STL and requires a predicate as third argument. This predicate must return true,
if the passed elements are equal. But the class sorted_vector
only has access to a predicate
which evaluates the < relation. In theory, it should be possible, to transform a predicate
evaluating the < relation into another predicate evaluating ==, but in practice this is only
possible, when the < predicate is implemented as object derived from std::binary_function
.
Ultimately, I had to implement unique
myself because of that.
History
- 1.0: November 19th, 2002; Initial release.
- 1.1: November 28th, 2002; Documentation and code Update
- changed namespace from
std
to codeproject
- supports member templates for constructing/inserting from iterator range in case of VC++7.0