Question

Consider following inputs:

1,1,2,3,5,8 - it isn't random

2,4,8,16,32 - this neither

4,1,2,11,5,9- this one looks like random-sequence

I would like to ask if is there such algorithm to prove if input is random or it isn't?

Was it helpful?

Solution

No, there is no such prove - if you have perfectly random numbers, the probability of each sequence of length n is equal. However, there are statistical tests to asses the quality of a random number generator, which is probably what you are looking for. See Diehard tests.

OTHER TIPS

As others have mentioned, you cannot prove that a sequence was generated randomly or not. In addition to what @timos suggested, it looks like you might have some additional requirements. It seems as though you desire to ensure that the sequence does not come from some hypothetical list of well-known, "non-random" sequences. If that is the case, you may be interested in learning of the On-Line Encyclopedia of Integer Sequences.

If you have a particular sequence in mind, you can check it against their database. For instance, 1,1,2,3,5,8 comes up in a number of sequences, but most prominantly in A000045 (Fibonacci numbers). A sequence like 4,1,2,11,5,9 does not show up in a search there.

None of this proves anything, but perhaps this is more in-line with your goals in this instance.

I want to emphasize here that the word "random" means not only identically distributed but also independent of everything else (including independent of any other choice).

There are numerous "randomness tests" available, including tests that estimate p-values from running various statistical probes, as well as tests that estimate min-entropy, which is roughly a minimum "compressibility" level of a bit sequence and the most relevant entropy measure for "secure random number generators". There are also various "randomness extractors", such as the von Neumann and Peres extractors, that could give you an idea on how much "randomness" you can extract from a bit sequence. However, all these tests and methods can only be more reliable on the first part of this definition of randomness ("identically distributed") than on the second part ("independent").

In general, there is no algorithm that can tell, from a sequence of numbers alone, whether the process generated them in an independent and identically distributed way, without knowledge on what that process is. Thus, for example, although you can tell that a given sequence of bits has more zeros than ones, or has no obvious pattern in them, you can't tell whether those bits—

  • Were truly generated independently of any other choice, or
  • form part of an extremely long periodic sequence that is only "locally random", or
  • were simply reused from another process, or
  • were produced in some other way,

...without more information on the process. As one important example, the process of a person choosing a password is rarely "random" in this sense since passwords tend to contain familiar words or names, among other reasons.

See also this question: A Good and SIMPLE Measure of Randomness

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top