How to find the last element of array in binary search

https://stackoverflow.com/questions/2260272

20-09-2019
|

Question

In binary Search algorithm, the upper-bound element is array.length-1, then how can I find the last element of an array?

If the lower-bound and upper-bound for element of an array of length 8 is 6 and 7 respectively, then my mid element coming out as:

mid=(6+7)/2 i.e. 6 in java

Solution

It really comes down to using the right compares with the correctly chosen midpoint. For instance (without variable type declarations),

binsearch(a,val,left,right){
    if(left==right) return left;
    mid = (left+right)/2;
    if(a[mid] < val)
        return binsearch(a,val,mid+1,right);
    else
        return binsearch(a,val,left,mid);
}

will give you the index of the leftmost element that matches val (even if it is the rightmost element in the array). You don't need to explicitly check the last two or round up rather than using the built in integer truncation.

However, if you want the index of the rightmost element that is equal to val then you need to change the < operator to > and mid should be given by

mid = (left+right+1)/2;

It's as simple as that.

Edit: One more thing, I looked at my code that I use for this and realized that you have to to also change the the calls to binsearch to end up on the rightmost element. I'll just post the full code for this (which I should have done in the first place). Here's a binary search to find the rightmost element equal to the val.

binsearch(a,val,left,right){
    if(left==right) return left;
    mid = (left+right+1)/2;
    if(a[mid] > val)
        return binsearch(a,val,left,mid-1);
    else
        return binsearch(a,val,mid,right);
}

OTHER TIPS

The simplest approach is to use half-open ranges. This way, your upper bound points one step after the last valid item in the array, though your lower bound points directly to the first item. However, during the search, you treat your range as inclusive - the out-of-range upper bound is a valid "no match found" result.

At the start of each iteration you have...

lower <= target <= upper

"target" means the index that will be found and returned.

You calculate mid as "(upper + lower) / 2". Since this truncates, mid can never be the same as upper, which is important. Since "upper" may be out-of-bounds, we can never legally evaluate "array [upper]".

To find the first item greater-than-or-equal-to the key...

if array[mid] >= k :  lower <= target <= mid
if array[mid] <  k :  mid+1 <= target <= upper

To find the first item greater-than the key...

if array[mid] >  k :  lower <= target <= mid
if array[mid] <= k :  mid+1 <= target <= upper

These subranges are inclusive, and must precisely meet but not overlap. A single item overlap at mid (an easy mistake) results in infinite loops, which is part of why we use mid+1 for one subrange.

Note that all that changes between the two searches is the comparison operator.

To find last-less-or-equal, find first-greater and subtract one from the result. You can get -1 if all items in the array are greater than the key.

Note - you only test key against mid in each iteration (you know that the lower and upper bounds are correct already) and you only do one conditional test.

Do the out-of-bounds check and equality check (if that's what you need) outside of the loop.

int find_first_ge (int key)
{
  int lower = 0;
  int upper = array.length;

  while (upper > lower)
  {
    int mid = (lower + upper) / 2;

    if (array [mid] >= key)  //  for find_first_gt, use ">" here
    {
      upper = mid;
    }
    else
    {
      lower = mid + 1;
    }
  }

  return lower;
}

NOTE

Edited to correct some mistakes that left this just as infinite-loopy as what it was meant to fix.

The trick is to ensure that the bisected subranges are exactly as needed after each key test, and always at least one smaller than the original full range - and through overconfidence and bad memory, that's exactly what I managed to get wrong. The above is based on real working code in a heavily-used library (searches within a node in a multiway-tree library), so if it's wrong I've got big problems ;-)

NOTE

Edited again to improve wording and simplify the subrange bounds descriptions (noting that although the ranges are half-open, they are treated as inclusive).

When your lower bound and your upper bound are within one of each other, check both.

You could round up each time.

(6+7)/2.0 == 6.5

Round it up and you'll come up with 7.

Or you could simply add one to your midpoint.

mid = (6+7)/2 + 1

Another way is changing your start or end point to +1 or -1 for the following recursion. This is what the wikipedia article on the subject shows in some implementations.

min = mid+1

max = mid-1

Binary search is notoriously tricky to get exactly right. There is a very thorough analysis on the various problems and edge cases, along with a correct implementation, in Programming Pearls, a book that every programmer should probably have read at least once.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow