Question

I know now a days inbuilt utilities available like HashCodeBuilder from Apache commons lang but i was trying to understand how to implement it myself and came across the example of hascode function for Employee class at http://en.wikipedia.org/wiki/Java_hashCode()

Everywhere on google, same kind of technique is suggested like multiplying non zero value with odd prime number and then summing it with instance variable(do it for instance variable).

Questions:-

1)why we can't return employeeId as hascode becoz it will aways be unique. Its simple and serves the hascode purpose. Agreed if it is not unique probably we need that kind of technique. Is that right?

2)Even If employee id is not unique, why its suggested to multiply with odd prime number? why taking any damn integer is not considered good?

Update:-

Peter i ran the example you mentioned it printed

[0, 32, 64, 96, 128, 160, 192, 224, 288, 256, 352, 320, 384]

[0, 32, 64, 96, 128, 160, 192, 224, 288, 256, 352, 320, 384]

i assume that output for now as yoy expected to understand the concept as you mentioned in your answer

[373, 343, 305, 275, 239, 205, 171, 137, 102, 68, 34, 0]

[0, 34, 68, 102, 137, 171, 205, 239, 275, 305, 343, 373]

As you suggested in your comment that this example demonstrated even unique hashcode can end up in same bucket. How this example demonstrated this behaviour? Do you mean 373 for integers and 0 for integers2 end up in same bucket ?

How prime number is helping in this example and how 34 would not have helped?

Was it helpful?

Solution

why we can't return employeeId as hascode becoz it will aways be unique. Its simple and serves the hascode purpose. Agreed if it is not unique probably we need that kind of technique. Is that right?

It's uniqueness is not important. Multiplying by a prime is a good way of merging multiple fields into one hashCode, but it sounds like you only have one, so it wont make much difference.

Even If employee id is not unique, why its suggested to multiply with odd prime number? why taking any damn integer is not considered good?

If you multiply by an even number what will the lowest bit of the hashCode be? How random/useful is it?


Note: every hashCode() for an Integer is unique, but get the right combination of integer values and when they are reduced to the capacity of a HashMap, they actually map to the same bucket. In this example, the entries appear in the reverse order they were added because every entry mapped to the same bucket.

HashSet<Integer> integers = new HashSet<>();
for (int i = 0; i <= 400; i++)
    if ((hash(i) & 0x1f) == 0)
        integers.add(i);
HashSet<Integer> integers2 = new HashSet<>();
for (int i = 400; i >= 0; i--)
    if ((hash(i) & 0x1f) == 0)
        integers2.add(i);
System.out.println(integers);
System.out.println(integers2);


static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

prints

[373, 343, 305, 275, 239, 205, 171, 137, 102, 68, 34, 0]
[0, 34, 68, 102, 137, 171, 205, 239, 275, 305, 343, 373]
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top