Question

I'm trying to learn how to do some basic hashing in Javascript and I've come across the following algorithm:

var hash = 0;
for (i = 0; i < this.length; i++) {
    char = str.charCodeAt(i);
    hash = ((hash<<5)-hash)+char;
    hash = hash & hash;
}

I don't really understand how it works and I was hoping you could help me out. In particular I don't understand (hash<<5)-hash and hash = hash & hash. Thank you for your replies.

Note: For anyone looking for the source, it's an implementation of Java’s String.hashCode(): http://werxltd.com/wp/2010/05/13/javascript-implementation-of-javas-string-hashcode-method

Was it helpful?

Solution

The step

  hash = ((hash << 5) - hash) + char;

is effectively:

  hash = ((hash * 32) - hash) + char;

Then,

  hash = hash & hash;

will only change the value if the number has overflowed the integer range (32 bits, or maybe 31). (I wouldn't do it that way but it's a matter of style.)

In that code, the variables "i" and "char" should be declared:

var hash = 0, i, char;

OTHER TIPS

<< and & are bitwise operations. One is shifting bits and the other is anding them.

0010 << 2 becomes 1000 because everything is shifted to the left.

0101 & 1110 becomes 0100 because the result is 1 whenever both values have 1's for that particular bit.

@Pointy's answer explains what the hash = hash & hash (& the same value) accomplishes, I wasn't sure what it did.

See:

https://en.wikipedia.org/wiki/Bitwise_operation https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Bitwise_Operators

A common technique for hashing is to start with 0. Then you multiply the existing hash value by a prime number, and finally add the new element to it.

In this case:

((hash << 5) - hash)

is effectively "hash * 31". Apply the left shift operator, <<, 5 times is like multiplying the number by 2, 5 times. Or, multiplying it by 2^5, which is 32. Then they subtract once, giving 31.

The hash & hash is effectively a "do nothing" operation, performing a logical AND (&) on a number with itself returns the same number, it doesn't "do anything", but it may be used to coerce the number and make sure it remains an integer. There's likely some JS machinations going on with the representation of the number and that's what the second line is for.

If this was just raw C, it would simply be hash = hash * 31 + char.

hash << 5 is bitshifting the value of hash to the "left" 5 places. This is equivalent to hash * 32. For a hash-generation algorithm, it's easier to think of it in terms of moving the bits.

hash = hash & hash looks like a mistake. It "ands" the value of hash with itself and assigns it back to itself. Anding the value with itself results in the original value, so it's a form of nop.

The << represents the left shift operation. This operator shifts the first operand the specified number of bits to the left, for example:

var num = 2; // in binary 10
var shift = 2 << 2;  //shift is 8, in binary 1000

The & operator represents an AND logical operation, for example:

var num1 = 6; // in binary 110
var num2 = 7; // in binary 111
var result = num1 & num2; // result is 6, in binary 110

for a complete reference on bitwise operators take a look on the Mozilla javascript reference.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top