質問

I was working on a simulation of the St Petersburg Paradox when I realized my coin-flipping code never recorded any streaks of greater than 15 heads in a row. I ran the simulation 100,000,000 times, which should have resulted in an average of 1526 streaks of heads 16 long.

(0.5^16) x 100,000,000 = 1526

Clearly, something is wrong.

#include <stdlib.h>
#include <stdio.h>
#include <time.h>

int main(int argc, char const *argv[])
{
srand(time(0));

int i, lim = 100000000, streak = 0, maxstreak = 0;
for (i = 0; i < lim; ++i)
{
    if (rand()%2) {
        streak++;
        if (streak > maxstreak) maxstreak = streak;
    }
    else streak = 0;
}

printf("Ran %d times, longest streak of %d\n", lim, maxstreak);
return 0;
}

Returns the following every time:

Ran 100000000 times, longest streak of 15

Thanks for your help!

Edit: running GCC version 4.6.2 on Windows 7 x64. Bit new to programming in general.

Edit 2: thanks for everyone's help! Anyone sticking around, I wonder what about the current implementation would give a limit of 15 heads? How would the rand() function be so interestingly broken to produce this problem?

役に立ちましたか?

解決

Try choosing different seed values for your random number generator. Though rand() is a pretty good random number generator, it really is a pseudo-random number generator. You might want to read the man pages for rand (man -s3 rand), which clearly state that you should (for some implementations) use higher-order bits than the lower-order bits...

NOTES
   The versions of rand() and srand() in the Linux C Library use the  same
   random number generator as random(3) and srandom(3), so the lower-order
   bits should be as random as the higher-order bits.  However,  on  older
   rand()  implementations,  and  on  current implementations on different
   systems, the lower-order bits are much less  random  than  the  higher-
   order  bits.   Do  not use this function in applications intended to be
   portable when good randomness is needed.  (Use random(3) instead.)

Without knowing more about the system you are running your program on, we cannot know whether that is your problem. But try changing your code to use a different bit than the 2^0 bit.

Running your version works for me,

/coinflipsim 
Ran 100000000 times
head 50006650, streak 27
tail 49993350, streak 25

Here is code that works for me, using a different bit than bit 0,

int main(int argc, char const *argv[])
{
    srand(time(0));

    int i, lim = 100000000;
    int head=0, tail=0;
    int hstreak=0, tstreak=0;
    int hstreakmax=0, tstreakmax=0;
    for (i = 0; i < lim; ++i)
    {
        //if (rand()%2)
        if( rand() & (1<<13) ) //pick a bit, try different bits
        {
            head++;
            if( ++hstreak>hstreakmax) hstreakmax=hstreak;
            tstreak=0;
        }
        else {
            tail++;
            if( ++tstreak>tstreakmax) tstreakmax=tstreak;
            hstreak=0;
        }
    }
    printf("Ran %d times\n",lim);
    printf("head %d, streak %d\n",head,hstreakmax);
    printf("tail %d, streak %d\n",tail,tstreakmax);
    return 0;
}

Changing the rand()%2 line to this and rerun,

        if( rand() & (1<<13) ) //pick a bit, try different bits

Different results,

./coinflipsim 
Ran 100000000 times
head 50001852, streak 25
tail 49998148, streak 28

他のヒント

Your code is fine - it's your C library's rand() implementation that is apparently sub-par. Possibly there is a correlation among the low-order bits of the output, or possibly the internal state is very small (so that your 100,000,000 trials are actually covering the entire output sequence of the generator many times over).

In the first case (correlated output bits), you could postprocess the generator's output to "whiten" it, but in the second case you will need to plug in a better implementation, for example the Mersenne Twister.

Let X(i) be the event that there is a head on the ith flip. Let E(i) = union { X(j) | i <= j < i + 16 } be the event that a streak of 16 heads starts at i.

Your analysis assumes that the events E(i) are independent. This is not correct. If E(i) does not happen, that greatly decreases the probability of the directly preceding E(i-1), E(i-2), etc from happening.

It is correct to say that E(i) and E(j) are independent iff |i - j| >= 16.

It could be your random number generator isn't great. Since rand() does ultimately produce a deterministic pattern, it's possible for a random generator to produce a pattern that never gives you 16 even (or odd) numbers in a row.

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top