Question

This is a question I got in a job interview:

You are given two sorted arrays (sizes n and m), and a number x. What would be the best algorithm to find the indexes of two numbers (one from each array), that their sum equals the given number.

I couldn't find a better answer than the naive solution which is:

  1. Start from the smaller array, from the cell that contains the largest number which is smaller than x.
  2. For each cell in small array. do binary search on the big one, looking for the number so the sum will equal x.
  3. Continue until the first cell of the smaller array, returning the appropriate indexes.
  4. Return FALSE if no such numbers exist.

Can anyone think of a better solution in terms of runtime?

Was it helpful?

Solution

Use two indices i1,i2 - set i1=0, i2=n-1

while i1 < m && i2>=0:
    if arr1[i1] + arr2[i2] == SUM:
         return i1,i2
    else if arr1[i1] + arr2[i2] > SUM:
         i2--
    else
         i1++
return no pair found

The idea is to use the fact that the array is sorted, so start from the two edges of the arrays, and at each iteration, make changes so you will get closer to the desired element

Complexity is O(n+m) under worst case analysis, which is better than binary search approach if min{m,n} >= log(max{m,n})

Proof of correctness (guidelines):

Assume the answer is true with indices k1,k2.
Then, for each i2>k2- arr1[k1] + arr2[i2] > SUM - and you will NOT change i1 after reaching it before getting to i2==k2. Similarly you can show that when you get to i2==k2, you will NOT change i2 before you get i1==k1.
Since we linearly scan the arrays - one of i1 or i2 will get to k1 or k2 at some point, and then - you will continue iterating until you set the other iterator to the correct location, and find the answer.
QED

Notes:
If you want to output ALL elements that matches the desired sum, when arr1[i1]+arr2[i2] ==SUM, change the element with the LOWER absolute difference to the next element in the iteration order. It will make sure you output all desired pairs.

Note that this solution might fail for duplicate elements. As is, the solution works if there is no pair (x,y) such that x AND y both have dupes.
To handle this case, you will need to 'go back up' once you have exhausted all possible pairs in one direction, and the pseudo code should be updated to:

dupeJump = -1
while i1 < m && i2>=0:
    if arr1[i1] + arr2[i2] == SUM:
         yield i1,i2
         if arr1[i1+1] == arr1[i1] AND arr2[i2-1] == arr2[i2]:
             //remembering where we were in case of double dupes
             if (dupeJump == -1):
                 dupeJump = i2
             i2--
         else:
             if abs(arr1[i1+1] - arr1[i1]) < abs(arr2[i2-1] - arr2[i2]):
                 i1++
             else:
                 i2--
            //going back up, because there are more pairs to print due to dupes
             if (dupeJump != -1):
                 i2 = dupeJump 
                 dupeJump = -1
    else if arr1[i1] + arr2[i2] > SUM:
         i2--
    else
         i1++


Note however that the time complexity might increase to O(n+m+size(output)), because there could O(n*m) such pairs and you need to output all of them (note that every correct solution will have this restriction).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top