Recently, I gave a group of developers a task which can be simplified to the following simple problem: you have a sorted array of elements; find the index of a given element in this array. They came up with following solution:
int[] sortedArray = new[] { 1, 5, 8, 12, 18, 20 };
List<int> list = new List<int>(sortedArray);
int result = list.IndexOf(8);
Of course, it works fine but didn’t we do too much work? Can we probably use the fact that the given array is sorted? Do we need an additional data structure List
?
If we read the Performance Considerations section in the MSDN class documentation, we see that to perform IndexOf()
operation, the list first sorts the content and then performs the binary search on it. See: List<> at MSDN.
So we can simply use the BinarySearch()
method of the Array static
class.
int result = Array.BinarySearch(sortedArray, 8);
So the BinarySearch()
does basically the same as the IndexOf()
does, but quicker, sometimes it’s better to remember basic algorithms instead of using complex data structures to accomplish the very narrow task efficiently.
For more information on binary search algorithm, see Binary Search (Wikipedia).
Why does the BinarySeacrh() do more than IndexOf() does?
Let’s search a non existing element using both methods and compare results.
int[] sortedArray = new[] { 1, 5, 8, 12, 18, 20 };
List<int> list = new List<int>(sortedArray);
int result1 = list.IndexOf(17);
int result2 = Array.BinarySearch(sortedArray, 17);
The result looks strange:
IndexOf(17)==-1
BinarySearch(17)==-5
Both of them deliver a negative value which should be interpreted as element not found. However as opposed to List.IndexOf()
which always delivers -1, the Array.BinarySearch()
returns some “magic” negative value. The following explanation can be found in class documentation:
Return Value: The index of the specified value in the specified array, if value is found. If value is not found and value is less than one or more elements in array, a negative number which is the bitwise complement of the index of the first element that is larger than value. If value is not found and value is greater than any of the elements in array, a negative number which is the bitwise complement of (the index of the last element plus 1).
For more information, see Array.BinarySearch Method (MSDDN).
Sounds complicated but means the following: the binary search travels the binary tree, at some point it realizes that element cannot be found and stops on a certain node. If you negate the result of the binary search and subtract 1, you will get the index of this element.
Index | 0 | 1 | 2 | 3 | 4 | 5 |
Element | 1 | 5 | 8 | 12 | 18 | 20 |
So if you search for 7, the result will be the negated index of element 8 minus 1, equals 2.
So if you search for 17, the result will be the negated index of element 18 minus 1, equals 4.
The formula is:
- Array.BinarySearch(sortedArray, 22) - 1
Important: If you are searching for a value which is greater than the last element, this formula gives you array length, which is a non existing index. So you should not use the result of this calculation to access an element of the array without checking if it’s less than length.
So if you search for 22, the result will be -7. -(-7) -1 = 6 = array length.
The return value of BinarySearch()
can be used in wide range of algorithms. For instance, when solving the problem of vertex collision detection. It might be useful also when working with text segments.