How does the JVM ensure that System.identityHashCode() will never change?

https://stackoverflow.com/questions/1063068

21-08-2019
|

Question

Typically the default implementation of Object.hashCode() is some function of the allocated address of the object in memory (though this is not mandated by the JLS). Given that the VM shunts objects about in memory, why does the value returned by System.identityHashCode() never change during the object's lifetime?

If it is a "one-shot" calculation (the object's hashCode is calculated once and stashed in the object header or something), then does that mean it is possible for two objects to have the same identityHashCode (if they happen to be first allocated at the same address in memory)?

Solution

Modern JVMs save the value in the object header. I believe the value is typically calculated only on first use in order to keep time spent in object allocation to a minimum (sometimes down to as low as a dozen cycles). The common Sun JVM can be compiled so that the identity hash code is always 1 for all objects.

Multiple objects can have the same identity hash code. That is the nature of hash codes.

OTHER TIPS

In answer to the second question, irrespective of the implementation, it is possible for multiple objects to have the same identityHashCode.

See bug 6321873 for a brief discussion on the wording in the javadoc, and a program to demonstrate non-uniqueness.

The header of an object in HotSpot consists of a class pointer and a "mark" word.

The source code of the data structure for the mark word can be found the markOop.hpp file. In this file there is a comment describing memory layout of the mark word:

hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)

Here we can see that the the identity hash code for normal Java objects on a 32 bit system is saved in the mark word and it is 25 bits long.

The general guideline for implementing a hashing function is :

the same object should return a consistent hashCode, it should not change with time or depend on any variable information (e.g. an algorithm seeded by a random number or values of mutable member fields
the hash function should have a good random distribution, and by that I mean if you consider the hashcode as buckets, 2 objects should map to different buckets (hashcodes) as far as possible. The possibility that 2 objects would have the same hashcode should be rare - although it can happen.

As far as I know, this is implemented to return the reference, that will never change in a objects lifetime .

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow