Question

I'm passing an array to a library function which returns an array which is a subsequence of the input array. That is to say the orders of the first and second array are identical but the second array may be lacking any number of elements of the first array. There will be no duplicates in either array!

I want to then build a new array of all the elements which were in the input but are not in the output of the function.

For some reason though it sounds trivial I keep getting it wrong, especially at the ends of the arrays it seems.

Example 1 (typical):

input array a:

[ yyz, ltn, tse, uln, ist, gva, doh, hhn, vlc, ios, app, tlv, lcy ]

input array b:

[ yyz, ltn, tse, uln, ist, gva, doh, hhn, vlc, tlv, lcy ]

output array "diff":

[ ios, app ]

Example 2 (minimal, reveals some bugs when the difference is at the end of the strings):

input array a:

[ usa ]

input array b:

[ ]

output array "diff":

[ usa ]

(I'm going to implement it in JavaScript / jQuery but I'm more interested in a generic algorithm in pseudocode since I'll actually be dealing with arrays of objects. So please I'm looking for algorithms which specifically use array indexing rather than pointers like I would in C/C++)

Was it helpful?

Solution

As the second array b is a subset of the first array a with the same order, you can walk both in parallel, compare the current values, and take the current value of a if it is different from the current value of b:

var a = ['yyz','ltn','tse','uln','ist','gva','doh','hhn','vlc','ios','app','tlv','lcy'],
    b = ['yyz','ltn','tse','uln','ist','gva','doh','hhn','vlc','tlv','lcy'],
    diff = [];
var i=0, j=0, n=a.length, m=b.length;
while (i<n && j<m) {
    if (a[i] !== b[j]) {
        diff.push(a[i]);
    } else {
        j++;
    }
    i++;
}
while (i<n) {
    diff.push(a[i++]);
}

Or if you prefer just one while loop:

// …
while (i<n) {
    if (j<m && a[i] === b[j]) {
        j++;
    } else {
        diff.push(a[i]);
    }
    i++;
}

OTHER TIPS

In java i would probably do something like this if I hade to use Arrays. You will have to loop over all your objects you get back and you will have to compare them to all of thoese you sent in so you will in the worst case have a O(n^2) complexity I belive, but, you can probably improve this by sorting your list you send in and the use pointers to to check each position (but since you didnt want to use pointers I leave this sample out) then you might be able to compare this in O(n).

public void doYourJob(){
        Object[] allObjects = new Object[10]; //hold all original values
        Object[] recivedArray = yourBlackBox(allObjects); //send in the array an gets the smaller one
        Object[] missingArray = new Object[allObjects.length - recivedArray.length];
        for(Object inObj : allObjects){
            boolean foundObject = false;
            for(Object obj : recivedArray){
                if(inObj.equals(obj)){
                    foundObject = true;
                    break;
                }
            }
            if(!foundObject)
                missingArray add inObj //add the missing object. This is not correct java code. =)
        }
    }

If I were aloud to use something from the Collection interface then this would be much simpler since you can use a "myArray.contains()" method.

With Lists instead

public void doYourJob(){
        List<Object> allObjects = new ArrayList<Object>(); //hold all original values
        List<Object>  recivedArray = yourBlackBox(allObjects); //send in the array an gets the smaller one
        List<Object>  missingArray = new ArrayList<Object>();
        for(Object inObj : allObjects){
            if(!recivedArray.contains(inObj))
                missingArray.add(inObj);
        }
    }

Do you have a guaranteed ordering imposed on your arrays? If so, it should be relatively simple to do something like:

# our inputs are array1 and array2, array2 is the one with 0 or more missing elements
ix1 = 0
ix2 = 0
diff = new array
while ix2 < length(array2)
  while (ix1 < length(array1)) and (array1[ix1] != array2[ix2])
     add array1[ix1] to diff
     ix1 = ix1 + 1
  ix1 = ix1 + 1
  ix2 = ix2 + i

return diff

If you do not have an ordering, you can either impose one (sort both arrays) or you can use a hash table.

hash = new hash
diff = new array

for each element in array1
  hash[element] = 1

for each element in array2
  hash[element] = hash[element] + 1

for each key in hash
  if hash[key] == 1
    add hash[key] to diff

Both of these should run in (roughly) O(n), if (and only if) adding an element to an array is O(1) (if you double the size of the result array every time it gets filled, it's at least asymptotically O(1)).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top