Pergunta

I want to use a PRNG to generate random patterns. I would provide the PRNG with a hash value as a seed. Ideally, the seed size would be 64-bit or 128-bit and I would expect no collisions if the seeds are unique.

Do PRNGs encounter 'collisions', where two different seeds produce the exact same sequence of random numbers?

For example, if one initializes a PRNG with all possible 32-bit, 64-bit or 128-bit numbers, one would expect zero collisions since all seeds would be different.

When considering a Linear congruential generator, the seed gets divided into a multiplier. Since a Lehmer LCG uses a multiplier of 231-1, supplying a random 32-bit hash would cause collisions, since it is dividing all possible 32-bit numbers into 231-1 numbers. Thus, a seed of 1, 232-1 and 231 are equivalent and produce the same sequence of numbers. This is a problem since I am expecting the collision resistance of 64 or 128-bit hashes in the PRNG.

For more complicated algorithms, it's not as simple to determine these cases. My language of choice is JavaScript, which has no built-in seeded PRNG. There are of course many options to remedy this in the sense of libraries or standalone functions.

The best/fastest available PRNG for JavaScript seems to be Alea. It has a JS and C implementation and is based on Marsaglia's MWC algorithm. The JS implementation takes a variable string seed, so it is not clear how many unique seeds are possible. Alea is described as having a period length of 2116, which if I understand correctly, would mean it utilizes up to 116 bits of a supplied seed (e.g. a 128-bit hash).

Basically what I am asking is, how would I determine if seeds collide in a PRNG? Is it based on the described period length?

A good illustration of the issue is to compare Jenkin's small PRNG (2007) with xoshiro128+ (2018). They are quite similar. They both use 32-bit arithmetic, store the state in a 128-bit array composed of four uint32's and return one uint32 as the result. The main difference is that Jenkins only uses a 32-bit seed that is repeated in b,c,d. While in xoshiro128+ the entire initial state is the seed. Thus I can supply a 128-bit hash to the latter as the seed and still benefit from the quality of the generator (without modifying the algorithm). That is essentially what I mean - being able to provide a very large seed.

As a workaround to this potential issue, I thought about simply using the hash function itself, and running further iterations of the mixing function to obtain new pseudo-random bits (in the same way as obtaining a new number in a PRNG). It is deterministic and appears random enough, but it seems unlikely that hash function outputs pass "statistical randomness tests" such as Diehard or BigCrush since it was not designed as a PRNG.

Also, there are no cryptographic requirements here. The hash function I would use would be MurmurHash3 or similar--anything good enough to pass SMHasher suite of tests and produces 64-bit or 128-bit hashes.

Nenhuma solução correta

Licenciado em: CC-BY-SA com atribuição
Não afiliado a cs.stackexchange
scroll top