Question

It is well known that the efficiency of randomized algorithms (at least those in BPP and RP) depends on the quality of the random generator used. Perfect random sources are unavailable in practice. Although it is proved that for all $0 < \delta \leq \frac{1}{2}$ the identities BPP = $\delta$-BPP and RP = $\delta$-RP hold, the original algorithm designed for a perfect random source cannot simply be run unchanged on a $\delta$-random source. Instead, some simulation has to be done. This simulation is polynomial, but the resulting algorithm is not as efficient as the original one.
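
For reference, I assume the $\delta$-random sources meant here are semi-random sources in the sense of Santha and Vazirani: sources emitting bits $x_1, x_2, \dots$ such that

$$\delta \;\le\; \Pr[x_i = 1 \mid x_1 = a_1, \dots, x_{i-1} = a_{i-1}] \;\le\; 1 - \delta \qquad \text{for all } i \text{ and all } a_1, \dots, a_{i-1} \in \{0,1\}.$$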

Moreover, as to my knowledge, the random generators used in practice are usually not even $\delta$-sources, but pseudo-random sources that can behave extremely badly in the worst case.

According to Wikipedia:

In common practice, randomized algorithms are approximated using a pseudorandom number generator in place of a true source of random bits; such an implementation may deviate from the expected theoretical behavior.

In fact, the implementations of randomized algorithms that I have seen up to now were mere implementations of the algorithms for perfect random sources run with the use of pseudorandom sources.

My question is whether there is any justification for this common practice. Is there any reason to expect that in most cases the algorithm will return a correct result (with the probabilities as in BPP resp. RP)? How can the "approximation" mentioned in the quotation from Wikipedia be formalized? Can the deviation be somehow estimated, at least in the expected case? Is it possible to argue that a Monte-Carlo randomized algorithm run on a perfect random source will turn into a well-behaved stochastic algorithm when run on a pseudorandom source? Or are there other similar considerations?


Solution

Here is one good justification. Suppose you use a cryptographic-strength pseudorandom number generator to generate the random bits needed by some randomized algorithm. Then the resulting algorithm will continue to work, as long as the crypto algorithm is secure.

A cryptographic-strength pseudorandom number generator is a standard tool from cryptography that accepts a short seed (say, 128 bits of true randomness) and generates an unlimited number of pseudorandom bits. It comes with a very strong security guarantee: as long as the underlying cryptographic primitive is not broken, no feasible process (in particular, no efficient algorithm) can distinguish its output from a sequence of true random bits. For instance, we might get a guarantee of the form: if factoring is hard (or if RSA is secure, or if AES is secure), then this is a good pseudorandom generator.
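
One way to make "indistinguishable by any feasible process" precise (a standard textbook statement, not specific to any particular generator): for every probabilistic polynomial-time distinguisher $D$ there is a negligible function $\varepsilon$ such that

$$\bigl|\, \Pr_s[D(G(s)) = 1] \;-\; \Pr_r[D(r) = 1] \,\bigr| \;\le\; \varepsilon(n).$$

Since a BPP algorithm $A$ together with a fixed input $x$ is itself such a distinguisher, replacing its true-random tape $r$ by the generator's output $G(s)$ changes its acceptance probability by at most $\varepsilon(n)$; an error bound of $1/3$ therefore degrades to at most $1/3 + \varepsilon(n)$, which is still bounded away from $1/2$.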

This is a very strong guarantee indeed, since it is widely believed to be very hard to break these cryptographic primitives. For instance, if you could figure out an efficient way to factor very large numbers, that would be a breakthrough result. For all practical purposes, you can act as though the cryptographic primitives are unbreakable. This means that, for all practical purposes, you can act as though the output of a cryptographic-strength pseudorandom number generator is basically the same as a sequence of true-random bits. In particular, it is a good source of the randomness needed by a randomized algorithm.
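
As a concrete illustration (my own sketch, not part of the original answer), here is a classic Monte Carlo algorithm, the Miller-Rabin primality test, drawing its random witnesses from Python's secrets module, which is backed by the operating system's cryptographic generator:

    import secrets

    def is_probable_prime(n: int, rounds: int = 40) -> bool:
        """Miller-Rabin test: one-sided error; a composite n is wrongly
        declared prime with probability at most 4**(-rounds), assuming
        the witnesses are (indistinguishable from) uniform."""
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        # Write n - 1 = d * 2**r with d odd.
        d, r = n - 1, 0
        while d % 2 == 0:
            d //= 2
            r += 1
        for _ in range(rounds):
            # Witness drawn from the OS CSPRNG, not a statistical PRNG.
            a = 2 + secrets.randbelow(n - 3)   # uniform in [2, n-2]
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(r - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False   # definitely composite
        return True            # probably prime

    print(is_probable_prime(2**89 - 1))   # True  (a Mersenne prime)
    print(is_probable_prime(561))         # False (Carmichael number)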

(I've glossed over the fact that, to use a crypto-strength PRNG, you still need to find 128 bits of true randomness on your own to form the seed. But usually this is not hard, and indeed, there are cryptographic tools to assist with that task as well.)
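
A minimal sketch of that seed-then-expand pattern, using SHAKE-256 as a stand-in for a crypto-strength PRNG (any secure stream cipher or extendable-output function plays the same role):

    import os
    import hashlib

    # 128 bits of true randomness from the operating system: the seed.
    seed = os.urandom(16)

    # Expand the short seed into as many pseudorandom bytes as the
    # randomized algorithm needs; SHAKE-256 is an extendable-output
    # function, so the output length is chosen by the caller.
    stream = hashlib.shake_256(seed).digest(1 << 20)   # 1 MiB of pseudorandom bytes

    print(len(stream), stream[:8].hex())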

In practice, getting extremely good pseudorandom bits is as simple as

$ cat /dev/urandom

Other Answers

"It is well known that the efficiency of randomized algorithms (at least those in BPP and RP) depends on the quality of the random generator used." I have to disagree. What a good pseudorandom sequence ensemble gives you is a guarantee on the performance of the algorithm when run on a random sequence from the ensemble, compared to a truly random sequence. If you don't have such a guarantee, you can't conclude that the algorithm will work poorly; you just don't know.

"Is there any justification for this common practice?" It works.

"Is it possible to argue that a Monte-Carlo randomized algorithm run on a perfect random source will turn into a well-behaved stochastic algorithm when run on a pseudorandom source?" Complexity theory suffers from two shortcomings. The first one is that it is very hard to prove anything (for example, P vs. NP is open). The second one is that it is mostly concerned with worst-case analysis. Put together, these two limitations rule out complexity theory as a good model for the behavior of algorithms in practice.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange