Question

I've been developing in Java with Netbeans for some time now, and there are some things I just rely on working without really questioning how. Among these are the automatically generated hashCode() and equals() methods.

The equals method is straightforward to follow, but I find the hashCode method somewhat enigmatic. I don't understand why it chooses the multipliers and applies the operations it does.

import java.util.Arrays;
import java.util.Objects;

public class Foo {

    int id;
    String bar;
    byte[] things;

    @Override
    public int hashCode() {
        int hash = 7;
        hash = 89 * hash + this.id;
        hash = 89 * hash + Objects.hashCode(this.bar);
        hash = 89 * hash + Arrays.hashCode(this.things);
        return hash;
    }    
}

Searching the documentation, this site, and Google for things like "netbeans generate hashcode" turned up nothing that seemed relevant. Is anyone here familiar with what this generation strategy is and why Netbeans uses it?

Edit:
Thanks for the answers so far! Especially due to this answer on the linked SO question, I understand the logic behind using primes in designing a hashCode method much more fully now. However, the other aspect of my question that nobody has really addressed so far is how and why Netbeans chooses the prime numbers that it does for its generated methods. The hash field and the other multiplier (89 in my example) seem to be different depending on various factors of the class.

For example, if I add a second String to the class, hashCode() becomes

public int hashCode() {
    int hash = 7;
    hash = 13 * hash + this.id;
    hash = 13 * hash + Objects.hashCode(this.bar);
    hash = 13 * hash + Objects.hashCode(this.baz);
    hash = 13 * hash + Arrays.hashCode(this.things);
    return hash;
}

So, why does Netbeans choose these specific primes, as opposed to any other ones?


Solution

This is an optimization aiming to better distribute the hash values. Eclipse does it similarly. Have a look at "Why use a prime number in hashCode?" and "Why does Java's hashCode() in String use 31 as a multiplier?".

This is in no way required. Even return 0; is sufficient to fulfill the equals/hashCode contract. The only reason to do more is that hash-based data structures perform better with well-distributed hash values.
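To make that concrete, here is a minimal sketch of a class whose hashCode() always returns 0. The class names ConstantHashKey and ConstantHashDemo are made up for illustration. Lookups in a HashSet still give correct answers, because equal objects still have equal hash codes; the cost is that every element lands in the same bucket, so operations get slower as the set grows.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class: equals() compares ids, hashCode() is constant.
final class ConstantHashKey {
    private final int id;

    ConstantHashKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConstantHashKey)) return false;
        return id == ((ConstantHashKey) o).id;
    }

    @Override
    public int hashCode() {
        // Legal: equal objects trivially share a hash code.
        // Slow: every instance collides with every other instance.
        return 0;
    }
}

class ConstantHashDemo {
    // Returns true if the set behaves correctly despite the constant hash.
    static boolean lookupWorks() {
        Set<ConstantHashKey> set = new HashSet<>();
        for (int i = 0; i < 1000; i++) {
            set.add(new ConstantHashKey(i));
        }
        return set.size() == 1000
                && set.contains(new ConstantHashKey(42))
                && !set.contains(new ConstantHashKey(-1));
    }

    public static void main(String[] args) {
        System.out.println(lookupWorks()); // prints "true"
    }
}
```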

Some would call this premature optimization. I guess it's OK since it's a) free (generated) and b) widely recognized (almost every IDE does it).

OTHER TIPS

IBM has an article on how to write your own equals() and hashCode() methods. What Netbeans generates is fine, though 31 tends to be a better prime because the multiplication can be optimized better.

Also have a look at how String.hashCode() works. It's exactly that, but with different primes and homogeneous types.
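As a rough illustration, the documented String.hashCode() formula, s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], can be computed with the same multiply-and-add loop as the generated method. The class and method names below are invented for this sketch.

```java
class StringHashDemo {
    // Multiply-and-add over the characters, starting from 0 with multiplier 31.
    // Integer overflow wraps around, exactly as in the generated hashCode().
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "netbeans";
        // The hand-rolled loop agrees with the built-in implementation.
        System.out.println(polynomialHash(s) == s.hashCode()); // prints "true"
    }
}
```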

The important thing to remember from Item 9 of Joshua Bloch's Effective Java (2nd ed.) is to always override hashCode() when you override equals(), so that equal objects have equal hash codes; otherwise you can easily violate the hashCode contract. While he says that a state-of-the-art hash function is a topic for doctoral research, the recipe he gives for a good general-purpose hashCode might, in your case, yield:

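@Override
public int hashCode() {
    int result = 17;
    result = 31 * result + id;
    // Use 0 for a null field so hashCode() never throws a NullPointerException.
    result = 31 * result + (bar == null ? 0 : bar.hashCode());
    // Arrays.hashCode already returns 0 for a null array.
    result = 31 * result + Arrays.hashCode(things);
    return result;
}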

As mentioned by @zapl and David Ehrmann, multiplication by 31 can easily be optimized into a bit shift and a subtraction, since 31 * i == (i << 5) - i, so that may work out to be a little faster if that's important.
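The identity behind that optimization is 31 * i == (i << 5) - i, which also holds when the int arithmetic overflows. A small sketch to check it (ShiftDemo and times31 are names made up for this example):

```java
class ShiftDemo {
    // 32 * i - i, written as a shift and a subtraction.
    static int times31(int i) {
        return (i << 5) - i;
    }

    // Compares the shift form against plain multiplication for a few values,
    // including ones that overflow.
    static boolean identityHolds() {
        int[] samples = {0, 1, -1, 7, 89, 123456789, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            if (times31(i) != 31 * i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(identityHolds()); // prints "true"
    }
}
```

Whether this makes a measurable difference depends on the JVM and the hardware; in practice the choice of multiplier matters more for distribution than for speed.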

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow