Question

Sorry for this not being a "real" question, but Sometime back i remember seeing a post here about randomizing a randomizer randomly to generate truly random numbers, not just pseudo random. I dont see it if i search for it.

Does anybody know about that article?

Was it helpful?

Solution

I believe that was on thedailywtf.com - ie. not something that you want to do.

It is not possible to get a truly random number from pseudorandom numbers, no matter how many times you call randomize().

You can get "true" random numbers from special hardware. You could also collect entropy from mouse movements and things like that.

OTHER TIPS

I have to disagree with a lot of the answers to this question.

It is possible to collect random data on a computer. SSL, SSH and VPNs would not be secure if you couldn't.

The way software random number generator work is there is a pool of random data that is gathered from many different places, such as clock drift, interrupt timings, etc.

The trick to these schemes is in correctly estimating the entropy (the posh name for the randomness). It doesn't matter whether the source is bias, as long as you estimate the entropy correctly.

To illustrate this, the chance of me hitting the letter e in this comment is much higher than that of z , so if I were to use key interrupts as a source of entropy it would be bias - but there is still some randomness to be had in that input. You can't predict exactly which sequence of letters will come next in this paragraph. You can extract entropy from this uncertainty and use it part of a random byte.

Good quality real-random generators like Yarrow have quite sophisticated entropy estimation built in to them and will only emit as many bytes as it can reliably say it has in its "randomness pool."

At the end of the post, I will answer your question of why you might want to use multiple random number generators for "more randomness".

There are philosophical debates about what randomness means. Here, I will mean "indistinguishable in every respect from a uniform(0,1) iid distribution over the samples drawn" I am totally ignoring philosophical questions of what random is.

Knuth volume 2 has an analysis where he attempts to create a random number generator as you suggest, and then analyzes why it fails, and what true random processes are. Volume 2 examines RNGs in detail.

The others recommend you using random physical processes to generate random numbers. However, as we can see in the Espo/vt interaction, these processes can have subtle periodic elements and other non-random elements, in part due to outside factors with deterministic behavior. In general, it is best never to assume randomness, but always to test for it, and you usually can correct for such artifacts if you are aware of them.

It is possible to create an "infinite" stream of bits that appears completely random, deterministically. Unfortunately, such approaches grow in memory with the number of bits asked for (as they would have to, to avoid repeating cycles), so their scope is limited.

In practice, you are almost always better off using a pseudo-random number generator with known properties. The key numbers to look for is the phase-space dimension (which is roughly offset between samples you can still count on being uniformally distributed) and the bit-width (the number of bits in each sample which are uniformally random with respect to each other), and the cycle size (the number of samples you can take before the distribution starts repeating).

However, since random numbers from a given generator are deterministically in a known sequence, your procedure might be exposed by someone searching through the generator and finding an aligning sequence. Therefore, you can likely avoid your distribution being immediately recognized as coming from a particular random number generator if you maintain two generators. From the first, you sample i, and then map this uniformally over one to n, where n is at most the phase dimension. Then, in the second you sample i times, and return the ith result. This will reduce your cycle size to (orginal cycle size/n) in the worst case, but for that cycle will still generate uniform random numbers, and do so in a way that makes the search for alignment exponential in n. It will also reduce the independent phase length. Don't use this method unless you understand what reduced cycle and independent phase lengths mean to your application.

An algorithm for truly random numbers cannot exist as the definition of random numbers is:

Having unpredictable outcomes and, in the ideal case, all outcomes equally probable; resulting from such selection; lacking statistical correlation.

There are better or worse pseudorandom number generators (PRNGs), i.e. completely predictable sequences of numbers that are difficult to predict without knowing a piece of information, called the seed.

Now, PRNGs for which it is extremely hard to infer the seed are cryptographically secure. You might want to look them up in Google if that is what you seek.

Another way (whether this is truly random or not is a philosophical question) is to use random sources of data. For example, unpredictable physical quantities, such as noise, or measuring radioactive decay.

These are still subject to attacks because they can be independently measured, have biases, and so on. So it's really tricky. This is done with custom hardware, which is usually quite expensive. I have no idea how good /dev/random is, but I would bet it is not good enough for cryptography (most cryptography programs come with their own RNG and Linux also looks for a hardware RNG at start-up).

According to Wikipedia /dev/random, in Unix-like operating systems, is a special file that serves as a true random number generator.

The /dev/random driver gathers environmental noise from various non-deterministic sources including, but not limited to, inter-keyboard timings and inter-interrupt timings that occur within the operating system environment. The noise data is sampled and combined with a CRC-like mixing function into a continuously updating ``entropy-pool''. Random bit strings are obtained by taking a MD5 hash of the contents of this pool. The one-way hash function distills the true random bits from pool data and hides the state of the pool from adversaries.

The /dev/random routine maintains an estimate of true randomness in the pool and decreases it every time random strings are requested for use. When the estimate goes down to zero, the routine locks and waits for the occurrence of non-deterministic events to refresh the pool.

The /dev/random kernel module also provides another interface, /dev/urandom, that does not wait for the entropy-pool to re-charge and returns as many bytes as requested. As a result /dev/urandom is considerably faster at generation compared to /dev/random which is used only when very high quality randomness is desired.

John von Neumann once said something to the effect of "anyone attempting to generate random numbers via algorithmic means is, of course, living in sin."

Not even /dev/random is random, in a mathematician's or a physicist's sense of the word. Not even radioisotope decay measurement is random. (The decay rate is. The measurement isn't. Geiger counters have a small reset time after each detected event, during which time they are unable to detect new events. This leads to subtle biases. There are ways to substantially mitigate this, but not completely eliminate it.)

Stop looking for true randomness. A good pseudorandom number generator is really what you're looking for.

If you believe in a deterministic universe, true randomness doesn't exist. :-) For example, someone has suggested that radioactive decay is truly random, but IMHO, just because scientists haven't yet worked out the pattern, doesn't mean that there isn't a pattern there to be worked out. Usually, when you want "random" numbers, what you need are numbers for encryption that no one else will be able to guess.

The closest you can get to random is to measure something natural that no enemy would also be able to measure. Usually you throw away the most significant bits, from your measurement, leaving numbers with are more likely to be evenly spread. Hard core random number users get special hardware that measures radioactive events, but you can get some randomness from the human using the computer from things like keypress intervals and mouse movements, and if the computer doesn't have direct users, from CPU temperature sensors, and from network traffic. You could also use things like web cams and microphones connected to sound cards, but I don't know if anyone does.

To summarize some of what has been said, our working definition of what a secure source of randomness is is similar to our definition of cryptographically secure: it appears random if smart folks have looked at it and weren't able to show that it isn't completely unpredictable.

There is no system for generating random numbers which couldn't conceivably be predicted, just as there is no cryptographic cipher that couldn't conceivably be cracked. The trusted solutions used for important work are merely those which have proven to be difficult to defeat so far. If anyone tells you otherwise, they're selling you something.

Cleverness is rarely rewarded in cryptography. Go with tried and true solutions.

A computer usually has many readily available physical sources of random noise:

  • Microphone (hopefully in a noisy place)
  • Compressed video from a webcam (pointed to something variable, like a lava lamp or a street)
  • Keyboard & mouse timing
  • Network packet content and timing (the whole world contributes)

And sometimes

  • Clock drift based hardware
  • Geiger counters and other detectors of rare events
  • All sorts of sensors attached to A/D converters

What's difficult is estimating the entropy of these sources, which is in most cases low despite the high data rates and very variable; but entropy can be estimated with conservative assumptions, or at least not wasted, to feed systems like Yarrow or Fortuna.

It's not possible to obtain 'true' random numbers, a computer is a logical construct that can't possibly create 'truly' random anything, only pseudo-random. There are better and worse pseudo-random algorithms out there, however.

In order to obtain a 'truly' random number you need a physical random source, some gambling machines actually have these built in - often it's a radioactive source, the radioactive decay (which as far as I know is truly random) is used to generate the numbers.

One of the best method to generate a random number is through Clock Drift. This primarily works with two oscillators.

An analogy of how this works is imagine a race car on a simple oval circuit with a while line at the start of the lap and also a while line on one of the tyres. When the car completes a lap, a number will be generated based on the difference between the position of the white line on the road and on the tyre.

Very easy to generate and impossible to predict.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top