Question

I know the title sounds boring, because many people have already asked about this topic. I hope it can help me get some insight into how the random module works. The issue is, I wrote two different functions that I think should be identical, but the results I'm getting are not identical, and I don't understand why.

I hope to end up with a "well-shuffled deck." I only care about whether cards are red or black, so my decks are very simple. I am calling "1" red and "0" black.

My idea was to build the deck by appending a 1 (red) if random.random() is > .5, or a 0 (black) otherwise, and then just appending 1 or 0 automatically when I have reached 26 (half the deck) of one color. But something is going wrong. deckmaker() doesn't work properly, although deckmaker2() does. Can anyone provide insight?

import random

def deckmaker():
    deck = []
    for i in range(52):
        if deck.count(0) == 26:
            deck.append(1)
        elif deck.count(1) == 26:
            deck.append(0)
        elif random.random() > .5:
            deck.append(0)
        else:
            deck.append(1)
    return deck

def deckmaker2():
    newdeck = []
    for i in range(26):
        newdeck.append(0)
    for i in range(26):
        newdeck.append(1)
    deck = []
    for i in range(52):
        x = random.randint(0,len(newdeck)-1)
        deck.append(newdeck.pop(x))
    return deck

Note: While writing this question I discovered the random.shuffle list operator, which does the same thing as my second function, so of course getting the shuffled deck turns out to be easy. But I still wonder why my original code doesn't do the same thing.

Edited: Sorry for being vague about the exact problem with deckmaker(). The thing is, I don't exactly understand what's wrong. It has to do with the fact that on the decks it produces, as you "turn over" the cards one by one, there are strategies that let you predict whether the "next card" is going to be red or black that don't work with decks created using random.shuffle

Edit 2: [lots more information] I will explain how I determined that deckmaker doesn't work, in case that is important. I was writing this program to model the puzzle posted here: http://www.thebigquestions.com/2013/12/17/tuesday-puzzle-4/

My strategy was going to be to remember the last few cards dealt, and use that information to determine when to decide to take the next card. I thought maybe after getting 5 "black" cards in a row, it was a good time to predict "red." I implemented it like so:

mycards = []

for j in range(1000):
    mydeck = deckmaker(52)
    mem_length = 5
    mem = []
    for c in range(mem_length):
        mem.append(4)
    for i in range(len(mydeck)):
        if mem.count(0) == mem_length:
            mycards.append(mydeck[i])
            break
        elif i == len(mydeck)-1:
            mycards.append(mydeck[i])
            break
        else:
            mem.append(mydeck[i])
            mem.pop(0)


x = float(mycards.count(1))

print x/len(mycards)

The result more than half the cards I was taking (putting into the list mycards) were "red," a result I achieved by taking the card after 5 red cards in a row were drawn. This made no sense, so I looked for a different way to create the decks and got a more normal result. But I still don't know what was wrong with my original decks.

Was it helpful?

Solution

Generally speaking, you should never believe that a randomization approach works correctly unless you can rigorously prove it works correctly. And that's often hard.

To get some insight into your problem, let's generalize the problematic function:

import random

def deckmaker(n):
    half = n // 2
    deck = []
    for i in range(n):
        if deck.count(0) == half:
            deck.append(1)
        elif deck.count(1) == half:
            deck.append(0)
        elif random.random() > .5:
            deck.append(0)
        else:
            deck.append(1)
    return deck

And here's a little driver:

from collections import Counter
c = Counter()
for i in range(1000):
    c[tuple(deckmaker(2))] += 1
for t in sorted(c):
    print t, c[t]

Running that:

(0, 1) 495
(1, 0) 505

So the two possibilities are about equally likely. Good! Now try a deck of size 4; just change the relevant line like so:

c[tuple(deckmaker(4))] += 1

Running that:

(0, 0, 1, 1) 236
(0, 1, 0, 1) 127
(0, 1, 1, 0) 133
(1, 0, 0, 1) 135
(1, 0, 1, 0) 130
(1, 1, 0, 0) 239

Oops! You could run a formal chi-squared test if you like, but it's dead obvious by inspection that two permutations (the first and the last) are about twice as likely as the other four. So the output isn't even close to being arguably random.

Why is that? Think about it ;-)

Hint

For a deck of size 2*M, what's the chance that the first M entries are all 0? There are two answers to that:

  1. If all permutations of M zeroes and M ones are equal likely, the chance is 1 in (2*M)-choose-M (the number of ways to pick the positions of the M zeroes).

  2. In the way the function constructs a deck, the chance is 1 in 2**M (0 and 1 are equally likely in each of the first M positions).

Generally speaking, (2*M)-choose-M is very much larger than 2**M, so the function constructs a deck starting with all zeroes far more often than "it should". For a deck of 52 cards (M == 26):

>>> from math import factorial as f
>>> one = f(52) // f(26)**2
>>> two = 2**26
>>> float(one) / two
7389761.998476148

So "starts with 26 zeroes" is over 7 million times more likely than it should be. Cool :-)

Doing it "one at a time"

So is it possible to do this correctly picking a 0 or 1 one at a time? Yup! You just need to use the right probabilities: when there are nzero zeroes remaining to be picked, and nremaining total "cards" remaining to be picked, pick zero with probability nzero / nremaining:

def deckmaker(n=52):
    deck = [None] * n
    nremaining = float(n)
    nzero = nremaining / 2.0
    for i in range(n):
        if random.random() < nzero / nremaining:
            deck[i] = 0
            nzero -= 1.0
        else:
            deck[i] = 1
        nremaining -= 1.0
    return deck

Note that there's no need to count. When nzero becomes 0.0, the if test will never succeed (random() < 0.0 can't happen); and once we pick n/2 ones, nzero == nremaining will be true, and the if test will always succeed (random() < 1.0 is always true). It's cute ;-)

OTHER TIPS

I am not sure of that is what you mean, but I think the last cards generated by the first method are disproportionally likely to be the same.

The chance that you get 26 ones or 26 zeros out of 50 or fewer samples seems large (I think you could use a commutative binomial distribution to calculate it). Even a small chance would mean you should bet on the last cards being the same (and therefore the others being the opposite) colour.

What you could do is just add zeros and ones until there are 52 cards, and then at the end change random ones into zeros (or the other way) until there are 26 of each.

EDIT: Let's say the deck has 4 cards. The first two colours would be random:

25% (0, 0)
25% (0, 1)
25% (1, 0)
25% (1, 1)

But for (0, 0) and (1, 1), the last two cards are already determined and will be (1, 1) and (0, 0) respectively. The others need one more number to determine the outcome:

25.0% (0, 0, 1, 1)
12.5% (0, 1, 0) > (0, 1, 0, 1)
12.5% (0, 1, 1) > (0, 1, 1, 0)
12.5% (1, 0, 0) > (1, 0, 0, 1)
12.5% (1, 0, 1) > (1, 0, 1, 0)
25.0% (1, 1, 0, 0)

So as you can see, not all options are equally likely. The difference is smaller for 52 cards of course, but the idea remains the same.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top