سؤال

I'm looking for an efficient algorithm that produces random values within a range, without repetitions.

In pseudo-code: (in class Rand)

  Rand(long from, long to) {
    this.from = from;
    this.to = to;
    // ...
  }

  long getNumber() {

    // returns a random number in the [from, to] range
    //  which has never been returned before
  }

Usage:

  Rand r = new Rand(1, 100000000);

  long x = r.getNumber();
  long y = r.getNumber();
  ...

The numbers returned from r.getNumber() should always be different from the ones previously returned.
Of course, if to - from + 1 numbers were returned, the algorithm should start again (or fall in error - not important anyway).

Note that the range may be very large, and thus a randomly arranged array (containing initially [from, to] numbers) is likely to overflow the memory.

هل كانت مفيدة؟

المحلول

A cypher is a one-to-one mapping, otherwise it couldn't be decrypted. Hence, any block cypher will map the numbers 0, 1, 2, 3, 4, 5, ... to different n-bit numbers, where n is the cypher's block size in bits.

It is relatively easy to put together a simple 4-round Feistel cypher with whatever (even) block size you want. With only four rounds it would be fast but insecure. Alternatively use the Hasty Pudding cypher which can have pretty much any block size you want.

Whatever cypher you use, just encrypt the numbers 0, 1, 2, ... and look at the output block. You can throw away any results that are outside your required range and all results are guaranteed unique.

نصائح أخرى

One way to do this would be to generate a list of numbers between from and to, removing these at random until the bag is empty, at which point it's re-populated. To save on storage for large ranges, you can record picked numbers up to a point (re-picking when a duplicate is chosen), since the probability of picking a duplicate should be low initially. Determining the optimal transition point will probably be an empirical exercise.

EDIT: Some more thoughts.

For truly huge ranges, not even that will provide good performance under the memory limitation. One idea might be to store the candidates not as a list of numbers, but as an interval. So, initially, you choose between from and to, getting x1. Next time, pick from a number from the first subinterval or the second, with probability in proportion to the interval length. At each step, eliminate intervals which have zero length. This requires storing M + 2 integers (in the worst case), where M is the number of draws, or N/2 asymptotically for large N (in the worst case), where N is the initial interval size. Somebody might double-check me, though.

If you don't require every number from the interval to appear eventually, you can use a linear congruental generator:

int getNumber() {
    seed = (seed * A + C) mod (to-from);
    return seed + to;
}

It's periodical, the new period begins when seed becomes equal to the initial value, and the length of the period depends on A and C choice.

Pros: O(1) time and space, cons: not every number from the interval will appear.

For intervals of length 2^m, take a look at http://en.wikipedia.org/wiki/Linear_feedback_shift_register I did not use it, but wikipedia says it is possible it to be maximum-length, i.e. you can have all numbers (except one) appear in the output.

Some thoughts for a starting point:

1) Supposes that you have a function f(x) which is a permutation on 1..N where N is larger than your range. If you apply it to x within the range it may produce an illegal value - one outside your range. You could define a permutation within your range by simply calling f again on the illegal value. You will eventually come to a legal value because the sequence x, f(x), f^2(x), f^3(x) must eventually cycle and if the worst comes to the worst it will come back in at x.

2) There are switching networks which allow you to produce all possible permutations on N objects for special N - one example is http://en.wikipedia.org/wiki/Clos_network#Bene.C5.A1_network_.28m_.3D_n_.3D_2.29 (Funny URL - Benes network). You can get an arbitrary permutation on N objects, where N may I think have to be a power of 2, by setting the switches at random. Since there will be K switches there are 2^K ways of setting them which means that you do not have M! permutations with equal probability but perhaps you won't mind this, or can minimise the non-randomness by repeating this multiple times, or something.

3) If you are prepared to achieve near-randomness by applying many different base permutations many different times you could try alternately adding mod N across the whole range and then splitting the range into sub-ranges and e.g. for some stretch of p-1 values within the range apply the permutation produced by multiplying by some k mode p. The hope would be that although each individual step is pretty basic by appling enough of them and making them diverse enough the result would be close to being random, especially if you chose the parameters at random.

I will pretend you are asking for a solution in R (according to tag). What you're trying to do is sample without replacement. In R, there's a function called sample. Here, I'm sampling a vector of 30 values (1, 2, 3... 30), and once I draw a number, it's not replaced. You can make this reproducible on other machines by setting a seed (see set.seed).

I ran this a number of times to show the randomness.

> sample(x = 1:30, size = 10, replace = FALSE)
 [1]  9 16 13 20 12  3  1  5 28  7
> sample(x = 1:30, size = 10, replace = FALSE)
 [1] 22 11 26 29 20  1  3  6  7 10
> sample(x = 1:30, size = 10, replace = FALSE)
 [1]  1 11 16  7 22 26  3 25  8  9
> sample(x = 1:30, size = 10, replace = FALSE)
 [1]  7 17  3 22 21 24 27 12 28  2
> sample(x = 1:30, size = 10, replace = FALSE)
 [1] 30 21 23  2 27 24  3 18 25 19
> sample(x = 1:30, size = 10, replace = FALSE)
 [1]  4  6 11 16 26  8 17 22 23 25

You could proceed this way (in python):

Create a xrange and pick k random element from it and use it as a RanndomGenerator:

import random

def randomFromTo(FROM,TO,k): #k is the number of sample you want to generate
    m= random.sample(xrange(FROM,TO),k)
    return (i for i in m)

Your Generator will generate all number randomly from FROM to TO and fail when it has generated more than k numbers:

with this example you would get :

RandomGenerator=randomFromTo(10,1000000000,12)

for k in range(12): 
    print RandomGenerator.next()

you get

57625960
50621599
2891457
56292598
54608718
45258991
24112743
55282480
28873528
1120483
56876700
98173231

@rossum said:

A cypher is a one-to-one mapping, otherwise it couldn't be decrypted. Hence, any block cypher will map the numbers 0, 1, 2, 3, 4, 5, ... to different n-bit numbers, where n is the cypher's block size in bits.

So even xor encryption or bitwise reverse would do for some purposes.

And here is a php function using xor and bitwise reverse as a simple 1-to-1 encryption.

It is a pseudo random number generator with guaranteed fill in of all values and no identical values. You supply n:0..63 and it provides a random 0..63.

It only accepts 2^i ranges, like 0..63, 0..127 etc.

Is not cryptographically safe etc, just random.

Such a function is extremely suitable for garbage collection routines, as it will not attempt to clean the same area twice, even though doing things randomly.

 function math_random_filled($n,$bits,$seed)
 {
   //bits: examples: 6=0..63, 8=0..255, 10: 0..1023
   //n: 0<= n <2^$bits
   //seed: any string or number

   //generate xor: crc32 + bit mask
   $xor= crc32($seed.'internalseed') & ~(-1<<$bits);
   //xor
   $r=intval($n)^$xor;
   //bitwise reverse
   $r=bindec(strrev(str_pad(decbin($r),$bits,'0',STR_PAD_LEFT)));
   return $r;
 }

 //demonstration
 $bits=6;
 $min=0;
 $max=pow(2,$bits)-1;
 $count=$max-$min+1;
 for($n=0;$n<=$max;$n++)
 {
   $r=math_random_filled($n,$bits,$seed='someseed');
   echo " $r ";
   $ar[$r]=1;
 }


 $set=0;
 for($n=0;$n<=$max;$n++)
   if(isset($ar[$n])) $set++;


 echo "\n"."bits: $bits,  count: $count, set: ". $set."\n\n";

example output:

 37  5  53  21  45  13  61  29  33  1  49  17  41  9  57  25  39  7  55  23  47  15  63  31  35  3  51  19  43  11  59  27  36  4  52  20  44  12  60  28  32  0  48  16  40  8  56  24  38  6  54  22  46  14  62  30  34  2  50  18  42  10  58  26 

bits: 6,  count: 64, set: 64

you can test the code here in php sandbox

And here is another one, but accepts any range, not only powers of 2.

function math_random_filled_arithmetical($n,$max,$seed)
    {
    /*
    - produces 0..max, repeatable, unique, one-to-one mapped random numbers
    - uses arithmetic operations to imitate randomness
    - $n: 0<=$n=<$max
    - $max: any integer, not only power of two
    - $seed: any string or number
    */
    $n=intval($n);
    $max=intval($max);
    $opt1=$n;
    $opt2=$max-$n;
    $n2=min($opt1,$opt2);
    $reverseit=crc32($seed.'internalseed'.$n2)&1;
    if($opt1>$opt2) $reverseit=!$reverseit;
    $max2=floor(intval($max-1)/2);
    //echo "n:$n, max:$max,n2:$n2,max2:$max2,reverseit:$reverseit\n";
    if($max>=3)
    if($opt1!=$opt2)
        $n2=math_random_filled_arithmetical($n2,$max2,$seed.'*');
    $res=$reverseit? $max-$n2:$n2;
    $res=intval(fmod($res+(crc32($seed)&(1<<30)),$max+1));
    //echo "n:$n, max:$max, res:$res\n";
    return $res;
    }


//demonstration
$max=41;//-- test a max value 
for($n=0;$n<=$max;$n++)
    {
    $r=math_random_filled_arithmetical($n,$max,$seed='someseed');
    $ar[$r]=1;
    echo " $r ";
    }
$filled=0;
for($n=0;$n<=$max;$n++)
    if(isset($ar[$n])) $filled++;
echo "\n count: ".($max+1).", filled: ". $filled."\n";

example output:

 20  19  18  17  33  32  37  36  14  13  31  34  35  26  16  11  12  3  39  40  0  41  1  2  38  29  30  25  15  6  7  10  28  27  5  4  9  8  24  23  22  21 
 count: 42, filled: 42

related php sandbox is here

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top