Question

I am building a composite key for a hash map in Java and want to define my own hash code for these objects. Which of the two approaches below is better? My composite key has three String attributes and one int attribute.

public int hashCode(){
    return (className + methodName + uniqueNumber).hashCode();
}

public int hashCode(){
    return (className + methodName + desc + uniqueNumber).hashCode();
}

I must have className, methodName, and uniqueNumber to guarantee that each key has a unique hash code. I want to go with the method that gives the least chance of a collision. My intuition is that the more attributes I "add" to my hash function, the less likely a collision will occur. However, I am not entirely certain this is correct.


Solution

Your question is a bit unclear about which fields are needed, or sufficient, to uniquely distinguish the key.

Generally, you should combine the individual field hashes within a composite key by multiplying each one by a different prime factor and summing the results.

Assuming the first example:

public int hashCode() {
    int h = className.hashCode() * 23;
    h += methodName.hashCode() * 17;
    h += uniqueNumber;
    return h;
}
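
One reason to combine per-field hashes rather than hash a concatenated string is that concatenation erases the boundary between fields. The sketch below is only an illustration: MethodKey and its three hash methods are hypothetical names, and Objects.hash is shown as a standard-library alternative to hand-picked primes, not as something the question requires.

```java
import java.util.Objects;

// Hypothetical key class for illustration; field names follow the question.
final class MethodKey {
    final String className;
    final String methodName;
    final int uniqueNumber;

    MethodKey(String className, String methodName, int uniqueNumber) {
        this.className = className;
        this.methodName = methodName;
        this.uniqueNumber = uniqueNumber;
    }

    // Concatenation approach from the question: field boundaries are lost,
    // so ("ab", "c") and ("a", "bc") produce the same string and the same hash.
    int concatHash() {
        return (className + methodName + uniqueNumber).hashCode();
    }

    // Per-field combination with prime multipliers, as in the answer.
    int primeHash() {
        int h = className.hashCode() * 23;
        h += methodName.hashCode() * 17;
        h += uniqueNumber;
        return h;
    }

    // Standard-library alternative that combines the field hashes for you.
    int objectsHash() {
        return Objects.hash(className, methodName, uniqueNumber);
    }
}
```

With keys ("ab", "c", 1) and ("a", "bc", 1), concatHash() returns the same value for both, while primeHash() and objectsHash() tell them apart.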

On the other hand, if uniqueNumber is actually unique, you can simplify:

public int hashCode() {return uniqueNumber;}

In your comment you mentioned one thing: "Using only uniqueNumber will generate a unique hash value, but I will lose my ability to reference specific values in my hashmap".

Now this is very important: "instance identity" is a very different thing to hash on and look up by than "value". You cannot use the same hash code and map for both.

For example, if you need a Key(ClassName, MethodName) -> SomeValue lookup, that is a "value" lookup. It needs to be hashed by the ClassName and MethodName values so that it is repeatable, i.e., so that you can construct a key for Map.get() to perform the lookup.
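
A minimal sketch of such a value lookup, assuming a hypothetical ValueKey class with just the two string fields. It overrides equals() and hashCode() together so that a key constructed later matches the one used for put():

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical value-based key: two keys are equal when their fields are equal.
final class ValueKey {
    private final String className;
    private final String methodName;

    ValueKey(String className, String methodName) {
        this.className = className;
        this.methodName = methodName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ValueKey)) return false;
        ValueKey other = (ValueKey) o;
        return className.equals(other.className)
            && methodName.equals(other.methodName);
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields that equals() compares.
        return Objects.hash(className, methodName);
    }
}

final class ValueKeyDemo {
    // Stores a value under one key instance, then retrieves it with a
    // second, separately constructed key holding the same field values.
    static String lookup() {
        Map<ValueKey, String> map = new HashMap<>();
        map.put(new ValueKey("Foo", "bar"), "some value");
        return map.get(new ValueKey("Foo", "bar"));
    }

    public static void main(String[] args) {
        System.out.println(lookup());
    }
}
```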

"Instance identity" actually has built-in support for hashing and maps in Java: it's called IdentityHashMap.

But in most cases, and especially for composite keys that are presumably going to be used in a map, the key needs to be reconstructible so that a lookup can be performed later. So the key should have value semantics, and it is doubtful whether your uniqueNumber should actually be part of the key.
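
To make the difference concrete, here is a small sketch (IdentityDemo is a hypothetical name) that puts the same String instance into a HashMap and an IdentityHashMap, then probes both with an equal but distinct instance:

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class IdentityDemo {
    // Returns { HashMap finds the equal copy,
    //           IdentityHashMap finds the equal copy,
    //           IdentityHashMap finds the original instance }.
    static boolean[] run() {
        String first = new String("Foo.bar");
        String second = new String("Foo.bar"); // equal value, distinct instance

        Map<String, String> byValue = new HashMap<>();
        Map<String, String> byIdentity = new IdentityHashMap<>();
        byValue.put(first, "v");
        byIdentity.put(first, "v");

        return new boolean[] {
            byValue.containsKey(second),    // compared with equals()
            byIdentity.containsKey(second), // compared with ==
            byIdentity.containsKey(first)
        };
    }
}
```

The HashMap finds the entry through the equal copy; the IdentityHashMap only finds it through the original instance, which is why an identity-based key cannot be reconstructed for a later lookup.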

When you go to do a lookup later, how do you get the correct uniqueNumber to retrieve the data? My feeling is that:

  1. Either there should be a first-class entity there instead, which you could use as the key directly (so no CompositeKey class is required any more), or

  2. You can't repeatably get uniqueNumber, in which case it doesn't work or isn't required anyway.

To summarize: if uniqueNumber were really required or applicable at all, I would expect it to already be encapsulated in a first-class entity. That does not appear to be the case here. It looks like you should most probably be using a value-based key and dropping uniqueNumber from the key (at least for this purpose).

So my recommendation:

public int hashCode() {
    int h = className.hashCode() * 23;
    h += methodName.hashCode() * 17;
    h += desc.hashCode();
    return h;
}

Let me know if this helps.

OTHER TIPS

A few comments:

(1) It is not necessary for hash codes to be unique. In fact, they usually are NOT guaranteed to be unique. In most circumstances, it would be too computationally expensive to guarantee uniqueness, nor would it be desirable. Collisions are not catastrophic.
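
As a quick illustration of why collisions are not catastrophic: the strings "Aa" and "BB" have the same hashCode() in Java, yet a HashMap still keeps them as separate entries because it falls back on equals() within a bucket. CollisionDemo is a hypothetical name used only for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

final class CollisionDemo {
    // Builds a map whose two keys collide on hashCode() but differ under equals().
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        System.out.println("Aa".hashCode() == "BB".hashCode());
        System.out.println(map.get("Aa") + " " + map.get("BB"));
    }
}
```

The cost of a collision is a slightly slower lookup for the affected keys, not a wrong result.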

(2) Hash codes should reflect the state of the object instance, not the object's class. Things like the class name normally do not enter into it, unless the class name IS the instance data, as in a class that represents one frame of a stack trace.

(3) A good hash code will have a large number of possible values, and these values will be distributed probabilistically such that collisions are UNLIKELY.

(4) In Java, a hash code must be consistent with Object.equals(): if two objects are equal according to equals(), they must return the same hash code. See the Javadoc for java.lang.Object for reference.

Licensed under: CC-BY-SA with attribution