質問

I'm doing a game with playing cards and have to store shuffled decks in MySQL.

What is the most efficient way to store a deck of 52 cards in a single column? And save/retrieve those using PHP.

I need 6 bits to represent a number from 0 to 52 and thus thought of saving the deck as binary data but I've tried using PHP's pack function without much luck. My best shot is saving a string of 104 characters (52 zero-padded integers) but that's far from optimal.

Thanks.

役に立ちましたか?

解決

I do agree that it's not necessary or rather impractical to do any of this, but for fun, why not ;)

From your question I'm not sure if you aim to have all cards encoded as one value and stored accordingly or whether you want to encode cards individually. So I assume the former.

Further I assume you have a set 52 cards (or items) that you represent with an integer value between 1 and 52.

I see some approaches, outlined as follows, not all actually for the better of using less space, but included for the sake of being complete:

  1. using a comma separated list (CSV), total length of 9+2*42+51 = 144 characters
  2. turning each card into a character ie a card is represented with 8 bits, total length of 52 characters
  3. encoding each card with those necessary 6 bits and concatenating just the bits without the otherwise lost 2 bits (as in the 2nd approach), total length of 39 characters
  4. treat the card-ids as coefficients in a polynomial of form p(cards) = cards(1)*52^51 + cards(2)*52^50 + cards(3)*52^49 + ... + cards(52)*52^0 which we use to identify the card-set. Roughly speaking p(cards) must lie in the value range of [0,52^52], which means that the value can be represented with log(52^52)/log(2) = 296.422865343.. bits or with a byte sequence of length 37.052 respectively 38.

There naturally are further approaches, taking into account mere practical, technical or theoretical aspects, as is also visible through the listed approaches.

For the theoretical approaches (which I consider the most interesting) it is helpful to know a bit about information theory and entropy. Essentially, depending on what is known about a problem, no further information is required, respectively only the information to clarify all remaining uncertainty is needed. As we are working with bits and bytes, it mostly is interesting for us in terms of memory usage which practically speaking is bit- or byte-based (if you consider only bit-sequences without the underlying technologies and hardware); that is, bits represent one of two states, ergo allow the differentiation of two states. This is trivial but important, actually :/

then, if you want to represent N states in a base B, you will need log(N) / log(B) digits in that base, or in your example log(52) / log(2) = 5.70.. -> 6 bits. you will notice that actually only 5.70.. bits would be required, which means with 6 bits we actually have a loss. Which is the moment the problem transformation comes in: instead of representing 52 states individually, the card set as a whole can be represented. The polynomial approach is a way to do this. Essentially it works as it assumes a base of 52, ie where the card set is represented as 1'4'29'31.... or mathematically speaking: 52 + 1 = 1'1 == 53 decimal, 1'1 + 1 = 1'2 == 54 decimal, 1'52'52 + 1 = 2'1'1 == 5408 decimal,

But if you further look at the polynomial-approach you will notice that there is a total of 52^52 possible values whereas we would only ever use 52! = 1*2*3*...*52 because once a card is fixed the remaining possibilities decrease, respectively the uncertainty or entropy decreases. (please note that 52! / 52^52 = 4.7257911e-22 ! which means the polynomial is a total waste of space).

If we now were to use a value in [1,52!] which is pretty much the theoretical minimum, we could represent the card set with log(52!) / log(2) = 225.581003124.. bits = 28.1976.. bytes. Problem with that is, that any of the values represented as such does not contain any structure from which we can derive its semantics, which means that for each of the 52! possible values (well 52! - 1, if you consider the principle of exclusion) we need a reference of its meaning, ie a lookup table of 52! values and that would certainly be a memory overkill.

Although we can make a compromise with the knowledge of the decreasing entropy of an encoded ordered set. As an example: we sequentially encode each card with the minimum number of bits required at that point in the sequence. So assume N<=52 cards remain, then in each step a card can be represented in log(N)/log(2) bits, meaning that the number of required bits decreases, until for the last card, you don't need a bit in the first place. This would give about (please correct)..

20 * 6 bits + 16 * 5 bits + 8 * 4 bits + 4 * 3 bits + 2 * 2 bits + 1 bit = 249 bits = 31.125.. bytes

But still there would be a loss because of the partial bits used unnecessarily, but the structure in the data totally makes up for that.

So a question might be, hej can we combine the polynomial with this??!?11?! Actually, I have to think about that, I'm getting tired.

Generally speaking, knowing about the structure of a problem drastically helps decreasing the necessary memory space. Practically speaking, in this day and age, for your average high-level developer such low level considerations are not so important (hej, 100kByte of wasted space, so what!) and other considerations are weighted higher; also because the underlying technologies are often reducing memory usage by themselves, be it your filesystem or gzip-ed web-server responses, etc. The general knowledge of these kind of things though is still helpful in creating your services and datastructures.

But these latter approaches are very problem-specific "compression procedures"; general compression works differently, where as an example approach the procedures sequencially run through the bytes of the data and for any unseen bit sequences add these to a lookup table and represent the actual sequence with an index (as a sketch).

Well enough of funny talk, let's get technical!

1st approach "csv"

// your unpacked card set
$cards = range(1,52);

$coded = implode(',',$cards);
$decoded = explode(',',$coded);

2nd approach: 1 card = 1 character

// just a helper
// (not really necessary, but using this we can pretty print the resulting string)
function chr2($i)
{
    return chr($i + 32);
}

function myPack($cards)
{
    $ar = array_map('chr2',$cards);
    return implode('',$ar);
}


function myUnpack($str)
{
    $set = array();
    $len = strlen($str);
    for($i=0; $i<$len; $i++) 
        $set[] = ord($str[$i]) - 32; // adjust this shift along with the helper

    return $set;
}

$str = myPack($cards);
$other_cards = myUnpack($str);

3rd approach, 1 card = 6 bits

$set = ''; // target string
$offset = 0;
$carry = 0;

for($i=0; $i < 52; $i++)
{
    $c = $cards[$i];

    switch($offset)
    {
        case 0:
            $carry = ($c << 2); 
            $next = null;
            break;

        case 2:
            $next = $carry + $c;
            $carry = 0;
            break;

        case 4:
            $next = $carry + ($c>>2);
            $carry = ($c << 6) & 0xff; 
            break;

        case 6:
            $next = $carry + ($c>>4);
            $carry = ($c << 4) & 0xff;
            break;      
    }

    if ($next !== null)
    {
        $set .= chr($next);
    }

    $offset = ($offset + 6) % 8;
}

// and $set it is!


$new_cards = array(); // the target array for cards to be unpacked
$offset = 0;
$carry = 0;

for($i=0; $i < 39; $i++)
{
    $o = ord(substr($set,$i,1));

    $new = array();
    switch($offset)
    {
        case 0:
            $new[] = ($o >> 2) & 0x3f;
            $carry = ($o << 4) & 0x3f;
            break;
        case 4:
            $new[] = (($o >> 6) & 3) + $carry;
            $new[] = $o & 0x3f;
            $carry = 0;
            $offset += 6;
            break;
        case 6:
            $new[] = (($o >> 4) & 0xf) + $carry;
            $carry = ($o & 0xf) << 2;
            break;
    }

    $new_cards = array_merge($new_cards,$new);

    $offset = ($offset + 6) % 8;
}

4th approach, the polynomial, just outlined (please consider using bigints because of the integer overflow)

$encoded = 0;
$base = 52;
foreach($cards as $c)
{
    $encoded = $encoded*$base + $c;
}

// and now save the binary representation


$decoded = array();
for($i=0; $i < 52; $i++)
{
    $v = $encoded % $base;
    $encoded = ($encoded - $v) / $base;
    array_shift($v, $decoded);
}
ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top