Question

We are given a random number generator RandNum50 which generates a random integer uniformly in the range 1–50. We may use only this random number generator to generate and print all integers from 1 to 100 in a random order. Every number must come exactly once, and the probability of any number occurring at any place must be equal.

What is the most efficient algorithm for this?

Solution

I came up with this $O(N^2)$ solution (so it may be wrong :-) based on the Fisher-Yates shuffle. To keep the distribution approximately uniform (see the EDIT section below), at every iteration you can use this trick to produce a value krand between $0$ and $k-1$:

 // return a random number in [0..k-1] with approximately uniform distribution
 // using a uniform random generator in [1..50]
 function krand(k) {
   sum = 0
   for i = 1 to k do sum = sum + RandNum50() - 1
   return sum mod k
 }
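To get a feel for how close krand comes to uniform, its exact distribution can be computed by convolving the single-draw distribution modulo $k$. The following is only a sketch: `krand_distribution` and its `sides` parameter are names made up for this illustration, and it assumes RandNum50 is perfectly uniform on 1..50.

```python
from fractions import Fraction


def krand_distribution(k, sides=50):
    """Exact distribution of (sum of k draws from 0..sides-1) mod k."""
    # Distribution of one draw (RandNum50() - 1) reduced mod k.
    single = [Fraction(0)] * k
    for v in range(sides):
        single[v % k] += Fraction(1, sides)

    # Start with the sum fixed at 0, then add k independent draws mod k.
    dist = [Fraction(0)] * k
    dist[0] = Fraction(1)
    for _ in range(k):
        nxt = [Fraction(0)] * k
        for a, pa in enumerate(dist):
            for b, pb in enumerate(single):
                nxt[(a + b) % k] += pa * pb
        dist = nxt
    return dist


if __name__ == "__main__":
    for k in (2, 3, 5, 7):
        print(k, krand_distribution(k))
```

For $k = 3$ the probabilities come out as $124998/375000$ for the value $0$ and $125001/375000$ for each of $1$ and $2$: close to $1/3$, but not equal. For values of $k$ that divide $50$ (such as $2$ or $5$) the result is exactly uniform.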

The Fisher-Yates algorithm becomes:

arr : array[0..99]
for i = 0  to 99 do arr[i] = i+1; // store 1..100 in the array
for i = 99 downto 1 {
  r = krand(i+1)  // random value in [0..i]
  exchange the values of arr[i] and arr[r]
}
for i = 0 to 99 do print arr[i]
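A runnable version of the pseudocode above might look like the sketch below. `rand_num_50` is a stand-in that simulates RandNum50 with Python's `random` module (an assumption, since the real generator is a black box), and the `rand_index` parameter is added here only so that a different index sampler can be plugged in.

```python
import random


def rand_num_50():
    """Stand-in for RandNum50: uniform integer in 1..50 (simulated)."""
    return random.randint(1, 50)


def krand(k):
    """Approximately uniform value in 0..k-1, as in the pseudocode."""
    return sum(rand_num_50() - 1 for _ in range(k)) % k


def shuffled_1_to_100(rand_index=krand):
    """Fisher-Yates shuffle of 1..100 driven by rand_index(k) in 0..k-1."""
    arr = list(range(1, 101))
    for i in range(99, 0, -1):
        r = rand_index(i + 1)  # random position in 0..i
        arr[i], arr[r] = arr[r], arr[i]
    return arr


if __name__ == "__main__":
    print(*shuffled_1_to_100())
```

With krand as the sampler, one shuffle makes $2 + 3 + \dots + 100 = 5049$ calls to the generator, which is where the $O(N^2)$ comes from.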

EDIT:

As pointed out by Erick, the krand function above doesn't return a truly uniform distribution. Other methods give a better (arbitrarily better) and faster approximation, but to my knowledge the only way to get a truly uniform distribution is rejection sampling: pick $m = \lceil \log_2(k) \rceil$ random bits, and if the number $r$ obtained is less than $k$, return it; otherwise draw $m$ fresh bits and try again. A possible implementation:

function trulyrand(k) {
    if (k <= 1) return 0
    while (true) { // ... if you're really unlucky ...
      m = ceil(log_2 (k) ) // calculate m such that k <= 2^m
      r = 0  // will hold the random value
      while (m > 0) {  // ... will add m bits
        if ( RandNum50() > 25 ) then b = 1 else b = 0   // random bit
        r = r * 2 + b  // shift and add the random bit
        m = m - 1
      }
      if (r < k) then return r  // we have 0<=r<2^m ; accept it, if r < k
    }
}
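The same rejection-sampling idea as a small Python sketch. Again `rand_num_50` only simulates RandNum50, and `(k - 1).bit_length()` is used as an integer-only way to get $\lceil \log_2 k \rceil$ for $k \ge 2$; both are choices made for this illustration.

```python
import random


def rand_num_50():
    """Stand-in for RandNum50: uniform integer in 1..50 (simulated)."""
    return random.randint(1, 50)


def random_bit():
    """One fair bit: 25 of the 50 outcomes map to 1, the other 25 to 0."""
    return 1 if rand_num_50() > 25 else 0


def truly_rand(k):
    """Exactly uniform value in 0..k-1 via rejection sampling on m bits."""
    if k <= 1:
        return 0
    m = (k - 1).bit_length()  # smallest m with k <= 2**m
    while True:
        r = 0
        for _ in range(m):
            r = (r << 1) | random_bit()  # shift in one random bit
        if r < k:
            return r  # accept; otherwise draw m fresh bits


if __name__ == "__main__":
    print([truly_rand(6) for _ in range(10)])
```

Since $k > 2^{m-1}$, each attempt is accepted with probability greater than $1/2$, so fewer than two attempts are needed on average, although no fixed number of attempts is guaranteed to suffice.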

Other answers

Since other folks have given approximate solutions and solutions involving taking indeterminate numbers of deviates, how about a proof that there is no such algorithm that's guaranteed to only require a finite number of RandNum50() calls?

As others have noted, printing the numbers from 1-100 in random order is equivalent to printing a random permutation of these numbers; there are 100! of these permutations, and so any particular permutation must be output with probability $\frac{1}{100!}$.

But if we knew that our algorithm used at most $k$ calls to RandNum50 for some $k$, then we could argue as follows: firstly, pad out those computation paths that make fewer than $k$ calls to RandNum50 to make additional dummy calls (that is, calls where the returned value is irrelevant), so that all computation paths make precisely $k$ calls. Any given sequence of $k$ results from our calls to RandNum50 must result in some output permutation, and so we can build an 'outcomes table' that maps any given sequence $(r_1, r_2, \ldots, r_k)$ of results from our calls into a particular output permutation. Since each of these outcomes is equally likely (each of them has probability $\displaystyle\frac{1}{50^k}$), then the probability of getting any particular permutation out of our algorithm must be of the form $\displaystyle\frac{c}{50^k}$ for some $c$. But $\displaystyle\frac{1}{100!}$ can't be of this form, because $100!$ doesn't divide $50^k$ for any $k$ (for instance, 3 divides $100!$ but can't divide any number of the form $50^k$). This means that no possible distribution of outcomes to random-number calls can produce a uniform permutation.
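Spelled out, the divisibility step reads as follows, where $c$ is the number of call sequences that the outcomes table maps to one fixed permutation:

$$\frac{c}{50^k} = \frac{1}{100!} \;\Longrightarrow\; c \cdot 100! = 50^k = 2^k \cdot 5^{2k},$$

so every prime factor of $100!$ would have to be $2$ or $5$, which fails because $3 \mid 100!$.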

The previous solutions are not optimal. The complexity is exactly $n\log n + O(1)$ in calls to RandNum50; the method is described in some detail here, using the following as a source of random bits (as suggested by Vor):

if ( RandNum50() > 25 ) then b = 1 else b = 0   // random bit

The basic idea is that you save a lot of bits if you generate a single uniform between $1$ and $n!$ and then use factorial base decomposition, instead of generating a sequence of uniforms with ranges $1$, then $2$, then $3$, and so on up to $n$. This is actually, as I mention in the post, the topic of a paper I have submitted!

If you do not know how to generate a uniform from a random bit, as suggested in that post, you could also generate an approximation of the uniform directly, in this way (which is equivalent to Vor's "trulyrand", but faster):

P = (RandNum50()-1) + (RandNum50()-1)*50^1 + (RandNum50()-1)*50^2 + ...

going as far as you need to go. This is the expansion of $P$ in base $50$. Then simply reduce $P$ modulo $n$, i.e., $Q = P \bmod n$, where in your case $n=100!$. This value is not perfectly uniform, but this kind of approximation is often used. Or, as Vor suggests, you can reject whenever $P \ge n$. With this value, you can then do the factorial base expansion as described in the post.
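A minimal sketch of the rejection variant followed by the factorial base expansion is given below. It is not the code from the linked post: `rand_num_50` simulates RandNum50, and `digits_needed`, `uniform_below`, `permutation_from_index` and `random_permutation` are names invented for this illustration.

```python
import math
import random


def rand_num_50():
    """Stand-in for RandNum50: uniform integer in 1..50 (simulated)."""
    return random.randint(1, 50)


def digits_needed(n):
    """Smallest d with 50**d >= n."""
    d = 1
    while 50 ** d < n:
        d += 1
    return d


def uniform_below(n):
    """Exactly uniform integer in 0..n-1 from base-50 digits, with rejection."""
    d = digits_needed(n)
    while True:
        p = 0
        for _ in range(d):
            p = p * 50 + (rand_num_50() - 1)  # append one base-50 digit
        if p < n:
            return p  # accept; otherwise redraw all d digits


def permutation_from_index(index, n):
    """Decode index in 0..n!-1 into a permutation of 1..n (factorial base)."""
    remaining = list(range(1, n + 1))
    perm = []
    for radix in range(n, 0, -1):
        index, digit = divmod(index, radix)  # digit lies in 0..radix-1
        perm.append(remaining.pop(digit))
    return perm


def random_permutation(n):
    return permutation_from_index(uniform_below(math.factorial(n)), n)


if __name__ == "__main__":
    print(*random_permutation(100))
```

For $n = 100$ each attempt uses 93 calls to the generator, since $50^{92} < 100! < 50^{93}$, compared with the thousands of calls made by the krand-based shuffle. The `pop` from a list keeps the decoding simple at the cost of quadratic time in $n$.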

I haven't done the analysis to confirm how uniform (or not) this would be, and it might be adjustable into a true shuffle, but could you just start from an array whose ith index holds i + 1 and, for k = 0..99, choose the (k + RandNum50() + RandNum50() - 1) mod (100 - k) index, with removal?

This "pushes" the peak in the RandNum50() + RandNum50() distribution forward uniformly.

I'm pretty sure this is not quite right as I've stated it because the 0 index (1) is not obtainable from the first choice and I cannot quickly see an alternative 1..50 + 1..50 adjustment that produces 0..99.

Update

To fix the issue I noted, I effectively used RandNum100, as mentioned in the question comments, to randomly initialise the first k offset.

This produces a distribution with a significant wave at the front.

Instead of advancing by 1 I used another RandNum50 to increment that first k. This produces a result that is random enough for me, but it is still not "truly" random, as can be easily seen if you change K to 2.

Testing VB.NET code where I catered for any even K. Note it is O(K), 6K+2 in fact.

Licensed under: CC-BY-SA with attribution