Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

An STL compliant sorted vector

0.00/5 (No votes)
28 Nov 2002 1  
A template container which implements set/multiset functionality using a vector

Introduction

sorted_vector adapts a std::vector to the interface required by std::set/std::multiset, thereby providing set/multiset and vector functionality (random access) in one container.

Tests show that sorted_vector's element retrieval (find) speed outperforms that of std::set/std::multiset by a factor of 2 (on most machines). On the downward side is the poor performance of sorted_vector's insert member function, because it inserts an element somewhere in the middle of the sorted vector in order to preserve the sort order. By using sorted_vector's low level interface one can solve this performance bottleneck by inserting a bunch of objects using a series of push_back's followed by a call to the member function sort, which restores the sort order.

sorted_vector should be preferred over std::set/std::multiset if many small elements must be stored. Most STL implementations use a variant of a balanced tree to implement set and multiset, which imposes a per element storage overhead of 3 pointers. The most important reason to use a std::set/std::multiset instead of sorted_vector is the need to keep iterators into a set/multiset while inserting more elements into the set. These iterators remain valid in the case of a set/multiset, but are invalidated in the case of a sorted_vector container.

Namespace

sorted_vector resides in the namespace codeproject.

Basic Usage

The following small table shows corresponding declarations of sorted_vector's and set's/multiset's.

STL concept std library sorted_vector
set set<Key,Pred> sorted_vector<Key,true,Pred>
multiset mutliset<Key,Pred> sorted_vector<Key,Pred>

(sorted_vector<Key,Pred,true> means sorted_vector<Key,Pred,bNoDuplicates=true>)

The various set/multiset insert and erase member functions as well as the set/multiset functions count, lower_bound,upper_bound, equal_range are part of sorted_vector's interface and behave the same as there set/multiset counterparts. The following code snippet shows the use of a sorted_vector to hold the set intersection of the content of two std::vector's.

#pragma warning(disable:4786)
#include "sorted_vector.h"

size_t             //build intersection using set interface of sorted_vector

BuildIntersection(  const std::vector<int>& v0,const std::vector<int>& v1,
                    codeproject::sorted_vector<int,true>& svecIntersection)
{		
    codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
    for(int i=0;i<v1.size();i++){
        if( svec.find(v1[i])!=svec.end() ){
            svecIntersection.insert(v1[i]);
        }
    }
    return svecIntersection.size();
}

The code example shows the use of the member functions find and insert. If you replace codeproject::sorted_vector<int,true> by std::set<int> you get exactly the same result. This piece of code can be optimized for speed by replacing the insert calls in the loop by calls to push_back of the base container (a vector) followed by a call to the member function sort at the end of the loop.

#pragma warning(disable:4786)
#include "sorted_vector.h"

size_t             //same as previous example, optimized insertions 

BuildIntersection1(  const std::vector<int>& v0,const std::vector<int>& v1,
                    codeproject::sorted_vector<int,true>& svecIntersection)
{		
    codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
    codeproject::sorted_vector<int,true>::Cont& vInterSect 
        = svecIntersection.get_container();
    for(int i=0;i<v1.size();i++){
        if( svec.find(v1[i])!=svec.end() ){
            vInterSect.push_back(v1[i]);
        }
    }
    svecIntersection.sort();
    return svecIntersection.size();
}

Interface of sorted_vector

Member Coming from Description
Cont sorted_vector container type, the type of the container used to store the controlled sequence.
value_type vector The type of object, T, stored in the set.
key_type set/multiset The key type associated with value_type.
key_compare set/multiset Function object that compares two keys for ordering.
value_compare set/multiset Function object that compares two values for ordering.
pointer vector Pointer to T.
reference vector Reference to T
const_reference vector Const reference to T
size_type vector An unsigned integral type.
difference_type vector A signed integral type.
iterator vector Random access iterator used to iterate through a vector.
const_iterator vector Random access const iterator used to iterate through a vector
reverse_iterator vector Random access iterator used to iterate backwards through a vector.
const_reverse_iterator vector Random access const iterator used to iterate backwards through a vector.
iterator begin() const vector Returns an iterator pointing to the beginning of the sorted_vector.
iterator end() const vector Returns an iterator pointing to the end of the sorted_vector.
reverse_iterator rbegin() const vector Returns a reverse_iterator pointing to the beginning of the reversed sorted_vector
reverse_iterator rend() const vector Returns a reverse_iterator pointing to the end of the reversed sorted_vector.
size_type size() const vector Returns the size of the sorted_vector.
size_type max_size() const vector Returns the largest possible size of the sorted_vector.
bool empty() const vector true if the sorted_vec's size is 0.
key_compare key_comp() const set/multiset Returns the key_compare object used by the sorted_vector.
value_compare value_comp() const set/multiset Returns the value_compare object used by the sorted_vector.
sorted_vector() set/multiset Creates an empty sorted_vector.
sorted_vector(const key_compare& comp) set/multiset Creates an empty sorted_vector, using comp as the key_compare object.
VC++7.0:
template <class InputIterator>
sorted_vector(InputIterator f,
InputIterator l)

VC++6.0:
sorted_vector(const_iterator f, const_iterator l)
set/multiset Creates a sorted_vector with a copy of a range.
VC++7.0:
template <class InputIterator>
sorted_vector(InputIterator f,
InputIterator l,const key_compare& comp)

VC++6.0:
sorted_vector(const_iterator f, const_iterator l,const key_compare& comp)
set/multiset Creates a sorted_vector with a copy of a range, using comp as the key_compare object.
sorted_vector(const sorted_vector&) set/multiset The copy constructor.
sorted_vector& operator=(const sorted_vector&) set/multiset The assignment operator (assigns other sorted_vector )
sorted_vector& operator=(const Cont&) sorted_vector The assignment operator (assigns other vector<T> )
void swap(sorted_vector&) set/multiset Swaps the contents of two sorted_vector's
pair<iterator, bool>insert(const value_type& x) set/multiset Inserts x into the sorted_vector.
iterator insert(iterator pos, const value_type& x) set/multiset Inserts x into the sorted_vector, using pos as a hint to where it will be inserted.
VC++7.0:
template <class InputIterator>
void insert(InputIterator f, InputIterator l)

VC++6.0:
void insert(const_iterator first f, const_iterator l)
set/multiset Inserts a range into the sorted_vector.
void erase(iterator pos) vector Erases the element pointed to by pos.
size_type erase(const key_type& k) set/multiset Erases the element whose key is k.
void erase(iterator first, iterator last) vector Erases all elements in a range.
void clear() vector Erases all of the elements.
iterator
find(const key_type& k) const
set/multiset Finds an element whose key is k.
size_type
count(const key_type& k) const
set/multiset Counts the number of elements whose key is k.
iterator
lower_bound(const key_type& k) const
set/multiset Finds the first element whose key is not less than k.
iterator
upper_bound(const key_type& k) const
set/multiset Finds the first element whose key is greater than k.
pair<iterator, iterator>
equal_range(const key_type& k) const
set/multiset Finds a range containing all elements whose key is k.
bool operator==(const sorted_vector&,const sorted_vector&) vector Tests two sorted_vector for equality. This is a global function, not a member function. There is also an operator !=
bool operator<(const sorted_vector&, const sorted_vector&) vector Lexicographical comparison. This is a global function, not a member function. There are also operators <= >= and >
Cont& get_container() sorted_vector Returns a reference to the internal vector used to store the controlled sequence
void sort() sorted_vector Restores the sort order using key_compare
reference at(size_type i) sorted_vector Returns a reference to the element at *(begin()+i) with range check
const_reference at(size_type i) const sorted_vector Returns a reference to the element at *(begin()+i) with range check
const_reference operator[](size_type i) const sorted_vector Returns a reference to the element at *(begin()+i)
reference operator[](size_type i) sorted_vector Returns a reference to the element at *(begin()+i)
const_reference front() const sorted_vector Returns a reference to the first element
reference front() sorted_vector Returns a reference to the first element
reference back() sorted_vector Returns a reference to the last element
const_reference back() const sorted_vector Returns a reference to the last element
void pop_back() sorted_vector Removes the last element from the sorted_vector

Points of Interest

An interview with Alexander Stepanov (inventor of the STL) at http://www.stlport.org/resources/StepanovUSA.html , in which he proposed to implement a sorted container as an adapter of a std::container gave me the idea for this work. The real surprise to me was the unexpected good performance of sorted_vector compared to set/multiset. The outcome of my tests indicate, that the std::set/std::multiset has only one (important) advantage over a sorted vector, namely that iterators remain valid in a set/multiset, even when more elements are added.

The implementation work itself was easy and did not require any advanced C++/STL feature. It can be summarized as follows: Most of the member functions of sorted_vector forward the call to the vec_ data member, which is a std::vector. The set/multiset specific functions for inserting/erasing and locating elements mostly use STL algorithms to deal with the sorted sequence present in the vec_ data member. The low level interface consisting of the member functions get_container() (which returns a reference to the vec_ data member) and sort and stable_sort (which restore the sort order) are necessary to improve insertion performance (by allowing a temporary violation of the sorting order). Only one function was unexpectedly difficult to implement: The function unique, which must be called after sort and stable_sort in the case of a set. This function is part of the STL and requires a predicate as third argument. This predicate must return true, if the passed elements are equal. But the class sorted_vector only has access to a predicate which evaluates the < relation. In theory, it should be possible, to transform a predicate evaluating the < relation into another predicate evaluating ==, but in practice this is only possible, when the < predicate is implemented as object derived from std::binary_function. Ultimately, I had to implement unique myself because of that.

History

  • 1.0: November 19th, 2002; Initial release.
  • 1.1: November 28th, 2002; Documentation and code Update
    • changed namespace from std to codeproject
    • supports member templates for constructing/inserting from iterator range in case of VC++7.0

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here