Question

I'm looking for a PRNG (pseudorandom number generator) that can be seeded with an arbitrary array of bytes.

Heard of any?

Solution

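Hashing your arbitrary-length seed (instead of XORing it, as paxdiablo suggested) makes seed collisions extremely unlikely: two different inputs yield the same seed only if the hash itself collides. With something such as SHA-2, an accidental collision is a practical impossibility. (The same holds for accidental collisions in SHA-1, although deliberately constructed SHA-1 collisions have since been published, so prefer SHA-2.)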
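
To illustrate why XOR is more collision-prone than a hash, here is a small Python sketch. The helper name xor_words is made up for this example. XOR is commutative, so two seeds made of the same 4-byte words in a different order fold to the same value, while their SHA-256 digests differ.

```python
import hashlib

def xor_words(data: bytes) -> int:
    """XOR together the 4-byte big-endian words of data.

    Illustrative helper; assumes len(data) is a multiple of 4.
    """
    seed = 0
    for i in range(0, len(data), 4):
        seed ^= int.from_bytes(data[i:i + 4], "big")
    return seed

a = b"paxdiabl"
b = b"iablpaxd"  # the same two words as a, swapped

# XOR folding cannot tell the two inputs apart...
xor_collides = xor_words(a) == xor_words(b)
# ...whereas the SHA-256 digests are different.
sha_collides = hashlib.sha256(a).digest() == hashlib.sha256(b).digest()

print(xor_collides, sha_collides)  # prints: True False
```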

You can then use your hashed seed as the input to a decent PRNG such as my favourite, the Mersenne Twister.
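
As a rough sketch of the hash-then-seed idea in Python: CPython's random.Random is a Mersenne Twister (MT19937) and accepts an arbitrarily large integer as a seed, so the SHA-256 digest can be passed in whole. The function name prng_from_bytes is an assumption for this example, not part of any library.

```python
import hashlib
import random

def prng_from_bytes(seed_bytes: bytes) -> random.Random:
    """Return a Mersenne Twister PRNG seeded from arbitrary bytes.

    The bytes are hashed with SHA-256 and the 256-bit digest is
    used as an integer seed for random.Random.
    """
    digest = hashlib.sha256(seed_bytes).digest()
    return random.Random(int.from_bytes(digest, "big"))

# The same seed bytes always give the same sequence.
rng_a = prng_from_bytes(b"paxdiablo")
rng_b = prng_from_bytes(b"paxdiablo")
same_sequence = [rng_a.random() for _ in range(5)] == [rng_b.random() for _ in range(5)]
print(same_sequence)  # prints: True
```

Python's random.seed() can also take a bytes object directly, but doing the hash explicitly makes the approach easy to port to other Mersenne Twister implementations.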

UPDATE

The Mersenne Twister implementation available here already seems to accept an arbitrary length key: http://code.msdn.microsoft.com/MersenneTwister/Release/ProjectReleases.aspx?ReleaseId=529

UPDATE 2

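For a sense of just how unlikely a SHA-2 collision is, consider how hard someone has to work to attack even reduced-round variants of it. Note that the attacks described are preimage attacks rather than collision attacks. Quoting http://en.wikipedia.org/wiki/SHA_hash_functions#SHA-2 :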

There are two meet-in-the-middle preimage attacks against SHA-2 with a reduced number of rounds. The first one attacks 41-round SHA-256 out of 64 rounds with time complexity of 2^253.5 and space complexity of 2^16, and 46-round SHA-512 out of 80 rounds with time 2^511.5 and space 2^3. The second one attacks 42-round SHA-256 with time complexity of 2^251.7 and space complexity of 2^12, and 42-round SHA-512 with time 2^502 and space 2^22.

OTHER TIPS

Why don't you just XOR your arbitrary sequence into a type of the right length (padding it with part of itself if necessary)? For example, if you want the seed "paxdiablo" and your PRNG has a four-byte seed:

paxd    0x70617864
iabl    0x6961626c
opax    0x6f706178
        ----------
        0x76707b70 or 0x707b7076 (Intel-endian).

I know that seed looks artificial (and it is, since the key is drawn from alphabetic characters). If the phrases are likely to come from a similar range and you really want to make the seed more disparate, XOR it again with a differentiator like 0xdeadbeef or 0xa55a1248:

paxd    0x70617864    0x70617864
iabl    0x6961626c    0x6961626c
opax    0x6f706178    0x6f706178
        0xdeadbeef    0xa55a1248
        ----------    ----------
        0xa8ddc59f    0xd32a6938

I prefer the second one since it will more readily move similar bytes into disparate ranges (the upper bits of the bytes in the differentiator are disparate).
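
The folding above could be sketched in Python roughly as follows. The function name xor_fold and the choice to pad by repeating the input from its start are assumptions made to match the "paxdiablo" example.

```python
def xor_fold(data: bytes, width: int = 4) -> int:
    """XOR-fold data into a width-byte integer (big-endian words).

    If len(data) is not a multiple of width, the input is padded
    with a prefix of itself, as in the "paxdiablo" -> "opax" step.
    """
    if not data:
        return 0
    # Round the length up to the next multiple of width.
    padded_len = -(-len(data) // width) * width
    repeats = -(-padded_len // len(data))
    padded = (data * repeats)[:padded_len]

    seed = 0
    for i in range(0, padded_len, width):
        seed ^= int.from_bytes(padded[i:i + width], "big")
    return seed

seed = xor_fold(b"paxdiablo")
print(hex(seed))               # prints: 0x76707b70
print(hex(seed ^ 0xdeadbeef))  # prints: 0xa8ddc59f
print(hex(seed ^ 0xa55a1248))  # prints: 0xd32a6938

# The same four bytes read in little-endian ("Intel-endian") order.
little = int.from_bytes(seed.to_bytes(4, "big"), "little")
print(hex(little))             # prints: 0x707b7076
```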

Licensed under: CC-BY-SA with attribution