Question

This is kind of a homework question, I've been thinking about it for quite a while, and came up with a couple of solutions but I think a better one exists.

What's the fastest way to determine if there is an element (int) in the array that appears only once? Any element can appear any number of times. {3, 1, 4, 1, 4, 3} will return false while {3, 1, 4, 1, 4, 1} would return true (3 appears once).

We are only allowed to use things we already learned (all the basics, recursion, oop, searching and sorting algos, including quicksort) so making a hash table is not an option.

So far the best practical solution I came up with is sorting it using quicksort then going through it ( O(nlogn) ), the best unpractical solution I came up with is making a big array the size of all possible int values and then using it's place similar to a hash table (but that array is WAY too big to actually implement) ( O(n) )

Is there another (practical) way to do this in O(n) time?

EDIT: just got an answer from the TA, the suggested O(n) solution that I heard about was an unpractical one (the same or similar to what I suggested) and hence they told us not to use it. I'm 99% sure now that the best practical answer (without hash tables) is O(nlogn) time.

Was it helpful?

Solution

You could use a customised quicksort to find distinct values without iterating over the sorted array afterwards.

When you have chosen a pivot value and are moving through the respective part of the array, IF the value matches the pivot, discard it AND discard the pivot value after you have moved through the part of the array, this would remove duplicates BEFORE the array is eventually sorted.

ie:

Sorting [5, 1, 4, 1, 4, 1]
If you choose the pivot as 4, you'd end up with the 2 sub arrays being:
[1, 1, 1] and [5]

If your pivot is never discarded, it is distinct, if it is discarded do the same process on the sublists. If a sublist has only 1 element, it is distinct.

In this way you can pick up distinct values MUCH earlier.

Edit: Yes this is still bounded by O(nlogn) ( I think ?)

OTHER TIPS

You essentially have to do a bubble-sort style compare. There's no built-in function to answer the problem, and even if you sort, you still have to iterate over every element (even just to find when groups break). You could do some more complicated approaches with multiple arrays, especially if you need to find which elements return only once.

But once you find one that appears once, you can break. This code would do it. It's O(n^2), but I'm not sure you can do faster for this problem.

boolean anySingles(int[] data])
{
 outer:
 for (int i = 0; i < data.length - 1; i++)
 {
  for (int j = 0; i < data.length; j++)
  {
   if (i != j)
   {
    if (data[i] == data[j]) continue outer;
   }
  }
  // made it to the end without finding a duplicate
  return true;
 }
 return false;
}

Let's do an experiment:

package test;

import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/**
 * Created with IntelliJ IDEA.
 * User: Nicholas
 * Date: 15.05.13
 * Time: 21:16
 */
public class Searcher {

    private static boolean searchBySorting(int [] array){
        int [] newArray = new int[array.length];
        System.arraycopy(array, 0, newArray,0, array.length);

        Arrays.sort(newArray);
        for (int i = 0; i < newArray.length - 2; ++i){
            if(newArray[i] == newArray[i + 1]){
                return true;
            }
        }

        return false;
    }

    private static boolean searchByCompare(int [] array){
        int [] newArray = new int[array.length];
        System.arraycopy(array, 0, newArray,0, array.length);

        for (int i = 0; i < newArray.length - 1; ++i){
            int value = newArray[i];
            for(int j = i + 1; j < newArray.length - 1; ++j){
                if(value == newArray[j]){
                    return true;
                }
            }
        }

        return false;
    }

    private static boolean searchBySet(int [] array){
        int [] newArray = new int[array.length];
        System.arraycopy(array, 0, newArray,0, array.length);

        Set<Integer> set = new HashSet<Integer>();
        for (int i = 0; i < newArray.length; ++i){
            if(set.contains(newArray[i])){
                return true;
            }

            set.add(newArray[i]);
        }

        return false;
    }

    private static int [] generateRandomArray(){
        Random random = new Random();
        int size = random.nextInt(1000) + 100;
        int [] array = new int[size];

        for (int i = 0; i < size; ++i){
            array[i] = random.nextInt();
        }

        return array;
    }

    public static void main(String [] args){

        long sortingTime = 0;
        long compareTime = 0;
        long setTime = 0;

        for (int i = 0; i < 1000; ++i){
            int [] array = generateRandomArray();

            long begin = System.currentTimeMillis();
            for(int j = 0; j < 100; ++j){
                searchBySorting(array);
            }
            long end = System.currentTimeMillis();
            sortingTime += (end - begin);

            begin = System.currentTimeMillis();
            for(int j = 0; j < 100; ++j){
                searchByCompare(array);
            }
            end = System.currentTimeMillis();
            compareTime += (end - begin);

            begin = System.currentTimeMillis();
            for(int j = 0; j < 100; ++j){
                searchBySet(array);
            }
            end = System.currentTimeMillis();
            setTime += (end - begin);
        }

        System.out.println("Search by sorting: " + sortingTime + " ms");
        System.out.println("Search by compare: " + compareTime + " ms");
        System.out.println("Search by insert: " + setTime + " ms");
    }
}

My results:

Search by sorting: 2136 ms

Search by compare: 11955 ms

Search by insert: 4151 ms

Are there any questions?

PS. The best algorithm I know is Tortoise and hare

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top