I have a simple std::vector
containing some numbers, which are sorted (in ascending order). I want to lookup an element, so far I use:
return std::lower_bound(vec.begin(), vec.end(), needle);
Where needle
is the element I look for. However, my vector tends to be quite long (millions of elements), but most of the time the contents are relatively predictable in a sense that if the first element is zero and the last element is N
, then the elements in between have value close to (N * index) / vec.size()
and are hence predictable.
Is there a modification of the lower bound, which would accept a hint (similarly to how std::map::emplace_hint()
does), such as:
assert(!vec.empty());
std::vector<int>::iterator hint = vec.begin() + std::min(vec.size() - 1,
(needle * vec.size()) / vec.back());
if(*hint > needle)
return std::lower_bound(vec.begin(), hint, needle);
else
return std::lower_bound(hint, vec.end(), needle);
This will work, but the lower_bound
ignores that it is close to the solution and will most likely start splitting the interval to halves (looking where we know that the needle most likely isn't), taking unnecessarily many steps. I know that there was an algorithm which starts with step 1, which it doubles until it overshoots the needle, and then does binary search in the given interval.
I forgot what is the name of the algorithm. Is it implemented in the STL?
Binary Search is a searching algorithm for finding an element's position in a sorted array. In this approach, the element is always searched in the middle of a portion of an array. Binary search can be implemented only on a sorted list of items. If the elements are not sorted already, we need to sort them first.
The binary_search() function returns a Boolean value. It returns true if the value is present in the array or vector and false if the value is not present.
Case 2: When the data is sorted, a Binary Search will be more time efficient as it will only take O(logn) time, while the Sequential Search will still take O(n) time. Save this answer. Show activity on this post. Binary search is always better.
I think the algorithm you're looking for is called interpolation search which is a variation on binary search that, instead of looking at the midpoint of the array, linearly interpolates between the array endpoints to guess where the key should be. On data that's structured the way that yours is, the expected runtime is O(log log n), exponentially faster than a standard binary search.
There is no standard implementation of this algorithm in C++, but (as a totally shameless plug) I happened to have coded this one up in C++. My implementation is available online if you're interested in seeing how it works.
Hope this helps!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With