Question

Here's the sample code from Item 9:

public final class PhoneNumber {
  private final short areaCode;
  private final short prefix;
  private final short lineNumber;

  @Override
  public int hashCode() {
    int result = 17;
    result = 31 * result + areaCode;
    result = 31 * result + prefix;
    result = 31 * result + lineNumber;
    return result;
  }
}

Pg 48 states: "the value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as muiltiplication by 2 is equivalent to shifting."

I understand the concept of multiplication by 2 being equivalent to bit shifting. I also know that we'll still get an overflow (hence information loss) when we multiply a large number by a large odd prime number. What I don't get is why information loss arising from multiplication by large odd primes is preferable to information loss arising from multiplication by large even numbers.

Was it helpful?

Solution

With an even multiplier the least significant bit, after multiplication, is always zero. With an odd multiplier the least significant bit is either one or zero depending on what the previous value of result was. Hence the even multiplier is losing uncertainty about the low bit, while the odd multiplier is preserving it.

OTHER TIPS

There is no such thing as a large even prime - the only even prime is 2.

That aside - the general point of using a medium-sized prime # rather than a small one like 3 or 5 is to minimize the chance that two objects will end up with the same hash value, overflow or not.

The risk of overflow is not the issue per se; the real issue is how distributed the hashcode values will be for the set of objects being hashed. Because hashcodes are used in data structures like HashSet, HashMap etc., you want to minimize the # of objects that could potentially share the same hash code to optimize lookup times on those collections.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top