Question

I am trying to calculate a "score" for a word so that it can be used to determine its lexicographical order in a Redis sorted set (words listed in alphabetical order).

The post I am reading says:

How to turn a word into a score?

For instance, if you want to use the first four letters to produce the score, this is the rule:

score = first-byte-value*(256^3) + second-byte-value*(256^2) + third-byte-value*(256^1) + fourth-byte-value

Just omit from the sum non existing chars if the word is < 4 chars in length.

Why this works? You are just considering the bytes as digits of a radix-256 number :)
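A minimal sketch of the quoted rule, for illustration only. The function name wordScore is hypothetical, and it assumes 64-bit PHP 7 or later. It reads "omit non existing chars" as "a missing byte counts as 0 while the first byte still carries the 256^3 weight", which keeps short words ordered before longer words that share their prefix.

```php
<?php
// Hypothetical helper: treat the first four bytes of $word as the
// digits of a base-256 number, most significant byte first.
function wordScore(string $word): int
{
    $score = 0;
    for ($i = 0; $i < 4; $i++) {
        // A missing character contributes 0 but still occupies its position.
        $byte = $i < strlen($word) ? ord($word[$i]) : 0;
        // Horner's rule: shift the previous digits up one base-256 place.
        $score = $score * 256 + $byte;
    }
    return $score;
}

echo wordScore('abcd'), "\n"; // 1633837924
echo wordScore('ab'), "\n";   // smaller than the score for 'abcd'
```

Only the first four bytes are compared, so words that share a four-byte prefix get the same score.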

With this theory I came up with the following code to test whether this would work in a PHP array:

$words = array('abcd', 'hello', 'dogs', 'hiya');
$newWords = array();

foreach ($words as $word) {
    $len = strlen($word);

    if ($len > 4) {
        $len = 4;
    }

    $i = 0;
    $j = $len - 1;
    $score = 0;

    while ($i < $len) {
        $byte = ord($word[$i]);

        if ($j == 0) {
            $score += $byte;
        }
        else {
            $score += $byte * (256 ^ $j);
        }

        $i++;
        $j--;
    }

    $newWords[$score] = $word;
}

ksort($newWords);
print_r($newWords);

However this returns:

Array
(
    [75950] => abcd
    [80858] => hello
    [81124] => dogs
    [85220] => hiya
)

This is not in alphabetical order.

Can anyone spot the issue (obviously the score calculation is wrong)? I may have misunderstood the post :-/


Solution

I tidied the code a little and switched to pow() for the exponentiation:

$words = array('abcd', 'hello', 'dogs', 'hiya');
$newWords = array();

foreach ($words as $word) {
    $len = strlen($word);

    if ($len > 4) {
        $len = 4;
    }

    $i = 0;
    $j = $len - 1;
    $score = 0;

    while ($i < $len) {
        $byte = ord($word[$i]);
        $score += $byte * pow(256, $j);
        $i++;
        $j--;
    }

    $newWords[$score] = $word;
}
ksort($newWords);
print_r($newWords);

It does exactly what you expected:

Array
(
    [1633837924] => abcd
    [1685022579] => dogs
    [1751477356] => hello
    [1751742817] => hiya
)

The original code used ^, which in PHP is the bitwise XOR operator rather than exponentiation: http://www.php.net/manual/en/language.operators.bitwise.php
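To make the difference concrete, here is a small sketch comparing ^ with pow() and the ** operator (available from PHP 5.6). The variable names are only illustrative. The last expression repeats the question's arithmetic for 'abcd' and reproduces the unexpected score of 75950.

```php
<?php
// ^ is bitwise XOR in PHP; pow() and ** perform exponentiation.
$xor   = 256 ^ 3;      // 259, because 0b100000000 XOR 0b11 = 0b100000011
$power = pow(256, 3);  // 16777216
$same  = 256 ** 3;     // 16777216

// Score for 'abcd' as computed with ^ in the question's loop.
$buggy = 97 * (256 ^ 3) + 98 * (256 ^ 2) + 99 * (256 ^ 1) + 100;

var_dump($xor, $power, $same, $buggy);
```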

Licensed under: CC-BY-SA with attribution