Question

This question and this question got me thinking a little bit. For sorting an array of length $n$ with $k$ unique elements in $O(n + k \log k)$, we need to be able to store counts of values in the array. There are some suggestions, but I'm looking for a way to do this in worst-case linear time. More specifically:

Given a list $A$ of $n$ elements with $k$ distinct elements, determine a list of tuples $U = \{(x_i, c_i)\}_{i=1}^{k}$ of all unique elements $x_i \in A$ such that $c_i$ is the count of element $x_i$ in $A$.

Here are some (failed) ideas I've had or that have been suggested:

  1. Balanced Binary Search Tree - With this, each insert or count increment takes $O(\log k)$, and after all inserts an in-order traversal takes $O(k)$. Thus, total time comes out to $O(n \log k)$, which is too slow (a sketch follows this list).
  2. Hash Map - With this we can get $O(1)$ expected inserts and thus $O(n)$ expected time (also sketched after the list). However, this is still not $O(n)$ worst case.
  3. Empty Space Mapping - Find the minimum and maximum elements of $A$, allocate (but do not initialize) enough memory to cover this range, and use that memory essentially as a hash map, storing a random tag alongside each count so that we don't mistake uninitialized garbage for real data (see the sketch after this list). This strategy has two issues: (1) it is probabilistic, with an extremely low but nonzero probability of failing, so it is still not guaranteed; (2) using memory like this limits us to integer or floating-point keys over a bounded range.
  4. Associative Arrays - There are many other associative arrays that can be used, similar to hash maps and BSTs, but I am not finding any that match these constraints.
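For concreteness, here is a minimal sketch of idea 1 in C++ using `std::map` (a red-black tree in typical implementations). The function name `countWithBST` and the use of `int` keys are illustrative assumptions, not part of the original problem statement.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Idea 1: balanced BST. Each insert/increment costs O(log k) and the final
// in-order traversal costs O(k), giving O(n log k) overall.
std::vector<std::pair<int, std::size_t>> countWithBST(const std::vector<int>& a) {
    std::map<int, std::size_t> counts;        // red-black tree in typical implementations
    for (int x : a) ++counts[x];              // O(log k) per element
    // O(k) traversal; the distinct values come out sorted.
    return std::vector<std::pair<int, std::size_t>>(counts.begin(), counts.end());
}
```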
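Idea 2 looks almost identical with `std::unordered_map`; again, `countWithHashMap` is just an illustrative name. The pass is $O(n)$ expected, but nothing rules out bad hashing behavior, which is exactly the worst-case gap noted above.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Idea 2: hash map. Each increment is O(1) expected, so the whole pass is
// O(n) expected time; the worst case is not guaranteed.
std::vector<std::pair<int, std::size_t>> countWithHashMap(const std::vector<int>& a) {
    std::unordered_map<int, std::size_t> counts;
    counts.reserve(a.size());                 // avoid rehashing during the pass
    for (int x : a) ++counts[x];              // O(1) expected per element
    // O(k) to collect the (value, count) pairs, in no particular order.
    return std::vector<std::pair<int, std::size_t>>(counts.begin(), counts.end());
}
```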
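Idea 3 might be sketched as below. This is only a conceptual illustration under the assumptions stated in the comments: reading uninitialized memory is undefined behavior in C++, allocation is assumed to be $O(1)$, and the random 64-bit tag makes a false positive extremely unlikely rather than impossible, which is precisely the probabilistic caveat above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <random>
#include <utility>
#include <vector>

// Idea 3 (conceptual sketch only). Reading uninitialized memory is undefined
// behavior in C++, so this merely illustrates the probabilistic idea.
std::vector<std::pair<int, std::size_t>> countWithEmptySpace(const std::vector<int>& a) {
    if (a.empty()) return {};

    auto [loIt, hiIt] = std::minmax_element(a.begin(), a.end());
    const long long lo = *loIt;
    const std::size_t range = static_cast<std::size_t>(*hiIt - lo) + 1;

    struct Slot { std::uint64_t tag; std::size_t count; };
    // Allocation assumed O(1); the memory is deliberately NOT initialized.
    std::unique_ptr<Slot[]> table(new Slot[range]);

    std::mt19937_64 rng{std::random_device{}()};
    const std::uint64_t secret = rng();           // per-run random tag

    std::vector<int> seen;                        // distinct values, for the final O(k) sweep
    for (int x : a) {
        Slot& s = table[static_cast<std::size_t>(x - lo)];
        if (s.tag != secret) {                    // almost certainly garbage => first occurrence
            s.tag = secret;
            s.count = 0;
            seen.push_back(x);
        }
        ++s.count;
    }

    std::vector<std::pair<int, std::size_t>> result;
    result.reserve(seen.size());
    for (int x : seen)
        result.emplace_back(x, table[static_cast<std::size_t>(x - lo)].count);
    return result;
}
```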

Maybe there is some obvious method I am missing, but I also think it might simply not be possible. What are your thoughts?

