This is not written in stone, rather an idea to improve binary search algorithm performance.
Introduction
I stumbled upon an idea of how binary search algorithm could be improved by using three-boolean logic.
Background
You need to understand what std::binary_search
is and as it follows: std::lower_bound
, that is actual implementation of binary search in STL.
Using the Code
If you have a look at std::lower_bound
algorithm, it is easy to see that it uses only operator <
(less) to traverse the data. That is pure binary (two-boolean) logic. But this practically means the algorithm does ignore the case when it actually hits the number it is looking for, but continues traversing till it gets back again to that point.
For instance, if the number is located into the middle of the sorted array and algorithm will pick one immediately, it will be ignored until iterator will be traversing the left (in case of lower_bound
) side of the data.
Here are the changes I have made to test the idea:
namespace tribl
{
enum class CMP { LESS=-1, EQ=0, GRTR=1 };
template <typename T>
inline CMP sign(T x1) { return (x1 > 0) ? CMP::GRTR : ((x1 < 0) ? CMP::LESS : CMP::EQ); }
template <typename T>
inline CMP sign(T x1, T x2)
{ return (x1 > x2) ? CMP::GRTR : ((x1 < x2) ? CMP::LESS : CMP::EQ); }
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wreturn-type"
template <class T>
inline CMP cmp(T const& i1, T const& i2) { assert(0); }
#pragma GCC diagnostic pop
template<>
inline CMP cmp<int>(int const& i1, int const& i2)
{ return tribl::sign(i1, i2); }
template<>
inline CMP cmp<std::string>(std::string const& s1, std::string const& s2)
{
return static_cast<CMP>(tribl::sign(::strcmp(s1.c_str(), s2.c_str())));
}
template<typename _ForwardIterator, typename _Tp, typename _Compare>
inline _ForwardIterator
__lower_bound(_ForwardIterator __first, _ForwardIterator __last,
const _Tp & __val, _Compare __comp) {
typedef typename std::iterator_traits < _ForwardIterator > ::difference_type
_DistanceType;
_DistanceType __len = std::distance(__first, __last);
while (__len > 0)
{
_DistanceType __half = __len >> 1;
_ForwardIterator __middle = __first;
std::advance(__middle, __half);
switch(__comp(*__middle, __val))
{
case CMP::LESS:
__first = __middle;
++__first;
__len = __len - __half - 1;
break;
case CMP::GRTR:
__len = __half;
break;
case CMP::EQ:
return __middle;
}
}
return __first;
}
template < typename _ForwardIterator, typename _Tp >
inline _ForwardIterator
lower_bound(_ForwardIterator __first, _ForwardIterator __last,
const _Tp & __val) {
__glibcxx_function_requires(_ForwardIteratorConcept < _ForwardIterator > )
__glibcxx_function_requires(_LessThanOpConcept <
typename iterator_traits < _ForwardIterator > ::value_type, _Tp > )
__glibcxx_requires_partitioned_lower(__first, __last, __val);
return tribl::__lower_bound(__first, __last, __val, tribl::cmp<_Tp>);
}
}
Testing
./build/apps/app
Filling up integers ...Done.
Filling up strings ...Done.
Integer's test 1
Elapsed Time (std): 5407.23 milli-seconds
Elapsed Time (tribl): 4781.78 milli-seconds
Integer's test 2
Elapsed Time (std): 8579.47 milli-seconds
Elapsed Time (tribl): 5026.94 milli-seconds
String's test
Elapsed Time (std): 21043.4 milli-seconds
Elapsed Time (tribl): 13190.8 milli-seconds
As you can see, the diff
is considerable.
Here is my environment:
cat /proc/cpuinfo | grep 'name'| uniq
model name : Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
g++ --version
g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Notes
It is to be noted that I do not account myself as an expert in STL algorithms, just because I have never contributed to STL development. This article is rather a question. So if you find any flow in my humble work, do not hesitate to state it.
In that same time, I do not invent the vehicle. The idea of using additional data to improve the performance is as old as this world. Everything is a trade-off.
Reference
I tried to keep the code as plain as possible to avoid any possible side effects of optimizations.
History
- 2nd November, 2020: Initial version