Question

$\newcommand\ldotd{\mathinner{..}}$Given that $A[1\ldotd n]$ are integers such that $0\le A[k]\le m$ for all $1\le k\le n$, and the occurrence of each number except a particular number in $A[1\ldotd n]$ is an odd number. Try to find the number whose occurrence is an even number.

There is an $\Theta(n\log n)$ algorithm: we sort $A[1\ldotd n]$ into $B[1\ldotd n]$, and break $B[1\ldotd n]$ into many pieces, whose elements' value are the same, therefore we can count the occurrence of each element.

I want to find a worst-case-$O(n)$-time-and-$O(n)$-space algorithm.

Supposing that $m=\Omega(n^{1+\epsilon})$ and $\epsilon>0$, therefore radix sort is not acceptable. $\DeclareMathOperator{\xor}{xor}$ Binary bitwise operations are acceptable, for example, $A[1]\xor A[2]$.

Was it helpful?

Solution

Here is an idea for a simple algorithm; just count all occurrences!

  1. Find $m = \max A$. -- time $\Theta(n)$
  2. "Allocate" array $C[0..m]$. -- time $O(1)$¹
  3. Iterate over $A$ and increase $C[x]$ by one whenever you find $A[\_]=x$. If $C[x]$ was $0$, add $x$ to a linear list $L$. -- time $\Theta(n)$
  4. Iterate over $L$ and find the element $x_e$ with $C[x_e]$ even. -- time $O(n)$.
  5. Return $x_e$.

All in all, this gives you a linear-time algorithm which may use (in the sense of allocating) lots of memory. Note that being able to random-access $C$ in constant time independently of $m$ is crucial here.

An additional $O(n)$ bound on space is more difficult with this approach; I don't know of any dictionary data-structure that offers $O(1)$ time lookup. You can use hash-tables for which here are implementations with $O(1 + k/n)$ expected lookup time ($n$ the table's size, $k$ the number of stored elements) so you can get arbitrarily good with linear space -- in expectation. If all values in $A$ map to the same hash value, you are screwed.


  1. On a RAM, this is implicitly done; all we need is the start position and maybe the end position.

OTHER TIPS

An almost trivial solution - which uses however $\Theta(n)$ space - is to use a hash map. Recall that a hash map has amortized runtime $\mathcal{O}(1)$ for adding and looking up elements.

Hence, we can use the following algorithm:

  1. Allocate a hash map $H$. Iterate over $A$. For each element $i \in A$, increase the number of occurences seen, i.e. $H(i)++$.

  2. Iterate through the key set of the hash map, and check which of the keys has an even count of occurences.

Now this is a simple algorithm which doesn't really use any large trick, but sometimes even this suffices. If not, you might want to specifiy what space restrictions you impose.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top