std::lower_bound and std::find on a plain array

https://stackoverflow.com/questions/10853569

12-06-2021

Question

I like to use the standard algorithms (from <algorithm>) on plain arrays whenever I can. Now I have two questions. Suppose I want to use std::lower_bound: what happens if the value I provide as an argument is not found?

int a[] = {1,2,3,4,5,6};
int* f = std::lower_bound(a,a+6,20);

The result I have when printing *f is 20.

The same happens if I use std::find.

int a[] = {1,2,3,4,5,6};
int* f = std::find(a,a+6,20);

The result I have when printing *f is 20.

Is it always the case that the return value is the original argument when this is not found?
In terms of performance, std::lower_bound performs better than std::find since it implements a binary search algorithm. If the array is big, say max 10 elements, could std::find perform better? Behind the scenes std::lower_bound calls std::advance and std::distance ..maybe I might save on these calls as well?

Thanks a lot

AFG

Solution

The result I have is 20. (Later edited to: The result I have when printing *f is 20.)

No, the result you get is a+6. Dereferencing that invokes undefined behavior. It might print 20, it might print "Shirley MacLaine", or it might blow up your car.

Is it always the case that the return value is the original argument when this is not found?

In your case the return value will always be the 2nd argument, because 20 is larger than every value in the array. If the value is not found but is smaller than some existing element, std::lower_bound returns a pointer to the first element larger than it, whereas std::find still returns the 2nd argument.

From cppreference.com, the return value of std::lower_bound is "iterator pointing to the first element that is not less than value, or last if no such element is found."
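As a rough sketch of how to act on that return value safely, compare the result with last before dereferencing it. The helper name first_not_less is made up for this illustration and is not part of the standard library.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not a standard function): returns a pointer to the
// first element of the sorted range [first, last) that is not less than
// value, or nullptr when std::lower_bound returned last.
inline const int* first_not_less(const int* first, const int* last, int value) {
    assert(std::is_sorted(first, last));  // lower_bound needs a sorted range
    const int* it = std::lower_bound(first, last, value);
    return it != last ? it : nullptr;     // never hand out the one-past-the-end pointer
}
```

With a = {1,2,3,4,5,6}, first_not_less(a, a+6, 20) yields nullptr instead of a pointer that must not be dereferenced, while first_not_less(a, a+6, 4) yields a+3.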

In terms of performance ...

Measure it. No other advice here will stand up to your actual empirical evidence.
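One possible way to measure it, as a minimal sketch: time many repeated searches with std::chrono::steady_clock. The names time_searches, kData, linear_offset and binary_offset are invented for this example, and the 10-element array and the probe values 0..11 are assumptions. A serious benchmark would also need warm-up runs, repeated trials and realistic data.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Hypothetical helper: calls search(value) for every value in [0, 12),
// `rounds` times over, and returns the elapsed time in nanoseconds.
template <typename Search>
long long time_searches(Search search, int rounds) {
    volatile std::ptrdiff_t sink = 0;  // keeps the optimizer from discarding the searches
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (int value = 0; value < 12; ++value) {
            sink = sink + search(value);
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Assumed test data: a small sorted array like the one in the question.
static const int kData[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

// Both functions return the offset of the result from the start of kData.
inline std::ptrdiff_t linear_offset(int value) {
    return std::find(kData, kData + 10, value) - kData;
}

inline std::ptrdiff_t binary_offset(int value) {
    return std::lower_bound(kData, kData + 10, value) - kData;
}
```

Comparing time_searches(linear_offset, 1000000) with time_searches(binary_offset, 1000000) on your own compiler and hardware gives a first impression. The numbers vary between machines and optimization levels.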

Behind the scenes std::lower_bound calls std::advance and std::distance ..maybe I might save on these calls as well?

Almost certainly not. Those calls are almost certainly optimized in your case to single (or very few) instructions.

OTHER TIPS

There's one significant difference between the iterators returned by lower_bound and find. If lower_bound doesn't find the item, it will return the iterator where the item should be inserted to retain the sort order. If find doesn't find the item, it will return the ending iterator (i.e. the second argument to find). In your example since you're trying to find something off the end of the array, both return the same iterator - but that's a complete coincidence.
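The difference shows up as soon as the missing value belongs somewhere in the middle. A small sketch, where SearchOffsets and compare_searches are hypothetical names used only for illustration:

```cpp
#include <algorithm>
#include <cstddef>

// Offsets from the start of the array returned by the two algorithms.
struct SearchOffsets {
    std::ptrdiff_t from_find;
    std::ptrdiff_t from_lower_bound;
};

// Hypothetical helper; [first, last) must be sorted for lower_bound.
inline SearchOffsets compare_searches(const int* first, const int* last, int value) {
    return { std::find(first, last, value) - first,
             std::lower_bound(first, last, value) - first };
}
```

For the array {1, 2, 4, 5, 6} and the value 3, from_find is 5 (the end of the array) while from_lower_bound is 2 (the position of 4, which is where 3 would be inserted). For the value 20 both are 5.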

In your example you must not dereference f, because it's equal to a+6. You have anyway, so you're in UB territory, but I suppose that the value 20 happens to be on the stack immediately after the array a.

It's true that for small enough arrays, a linear search might be faster than a binary search. 10 is "small", not "big". If you have a program which is doing a lot of searches into small arrays, you can time each one and see.

There should be basically no overhead for std::advance and std::distance - any halfway competent C++ compiler will inline everything, and they'll turn into pointer addition and subtraction.
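To illustrate what those calls amount to for raw pointers, here is a short sketch. The function names span_length and midpoint are made up for this example.

```cpp
#include <cstddef>
#include <iterator>

// For raw pointers (random-access iterators), std::distance is a pointer
// subtraction and std::advance is a pointer addition.
inline std::ptrdiff_t span_length(const int* first, const int* last) {
    return std::distance(first, last);  // same result as last - first
}

inline const int* midpoint(const int* first, const int* last) {
    const int* mid = first;
    std::advance(mid, std::distance(first, last) / 2);  // same as first + (last - first) / 2
    return mid;
}
```

For a six-element array a, span_length(a, a+6) is 6 and midpoint(a, a+6) is a+3, which is the kind of step a binary search takes on each iteration.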

You could use the following implementation

int a[] = {1,2,3,4,5,6};

int f = std::lower_bound(a,a+6,20)-a;

If 20 is present in the array, f is the index of its first occurrence in a (0-based indexing). Here 20 is not present and is larger than every element, so f is 6, i.e. the length of the array.

In the worst case the item being searched for sits at index n-1 (where n is the size of the array), and f is n-1.

f equals n only when every element is smaller than the item, which means the item is not in the array. The converse does not hold: for a missing item that is smaller than some element, f is the index of the first larger element, so compare a[f] with the item before treating f as a match.
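Note that the index alone does not say whether the value was found, because std::lower_bound also returns a valid index for a missing value that is smaller than some element. A minimal sketch of a checked lookup follows. The name index_of and the -1 "not found" convention are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: index of value in the sorted array a[0..n), or -1 if absent.
inline std::ptrdiff_t index_of(const int* a, std::size_t n, int value) {
    assert(std::is_sorted(a, a + n));  // lower_bound needs a sorted range
    const int* it = std::lower_bound(a, a + n, value);
    // lower_bound only locates the first element >= value, so confirm an exact match.
    if (it == a + n || *it != value) {
        return -1;
    }
    return it - a;
}
```

With a = {1,2,3,4,5,6}, index_of(a, 6, 4) is 3 and index_of(a, 6, 20) is -1. With {1,2,4,5,6}, looking up 3 also gives -1, even though std::lower_bound itself would return index 2.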

Hope it answers your question.

Licensed under: CC-BY-SA with attribution
