Question

Given a list of 4 billion integers, find an integer not in the list using 4MB of memory. (interview was in Java)

My solution is to use a BitSet.

However according to my calculations there aren't enough bits in 4 MB of memory! = c

4 Megabytes = 4096 KB # multiply by 8

4096 KB =~ 4,096,000 Bytes # multiply by 1000

4,096,000 Bytes =~ 33,500,000 bits # multiply by 8

So 33,500,000 bits is two orders of magnitude less than a billion. Let a lone the 4 billion.


Or is it part of the question to work with this limitation?

Was it helpful?

Solution

There's nothing in the question stating you can only make one pass of the data.

So, assuming it's not a mistake, you could still use a bit set but do it in groups. For the first pass, check only the numbers from zero to thirty million (roughly). Second pass, check from thirty to sixty million. And so on.

That would still allow you to find a missing number within the constraints of the question.

OTHER TIPS

You wanted a bloom filter.

Unlike a standard hash table, a Bloom filter of a fixed size can represent a set with an arbitrary large number of elements; adding an element never fails due to the data structure "filling up." However, the false positive rate increases steadily as elements are added until all bits in the filter are set to 1, at which point all queries yield a positive result.

Some additional reading

iterate through the list and keep the max number. Then int notInList = max + 1

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top