Memory Address as Hash in HashTable. Why can't HasTable use actual memory address of the Key as Hash?

Question 1

If a class does not override hashCode(), and just inherits the default implementation from java.lang.Object, then in a typical JVM, its hashCode(), really will be, more or less, an internal pointer to it. (Obviously that's not the whole story, because the return-type of hashCode() is int, which wouldn't accommodate 64-bit JVMs; and these aren't real pointers to physical memory locations, firstly because the OS handles the mapping from virtual addresses to physical ones, and secondly because, even if the JVM handled that, garbage-collectors can move an object from one heap to another without affecting its hashCode(). But still, "internal memory address" is a good first approximation.)

The reason that most JDK classes override hashCode() is that we always want hashCode() to be "compatible" with equals(); that is, if a.equals(b), then we want a.hashCode() == b.hashCode(). (This makes sense when you consider that you generally don't want — for example — a Map<String, Object> to have two different "abc" entries just because the keys are two different String instances. And generally you'd want to be able to look up an entry by typing map.get("abc"), rather than needing to have the original instance of the key handy. If two keys are equal, then we usually want them to be treated as equal.)

If you really want pointer-equality in your map, you can use the java.util.IdentityHashMap class.

Question 2

The default Object.hashCode() is not strictly a memory address, but unless you have a huge memory, it's indeed unique among all the objects in the JVM, so you could see it as a "logical" address.

A HashMap has a limited number of buckets, and each key is indeed assigned a bucket based on its hash code. There is not one bucket per hash code. So even if two objects have a different hash code, they could end up in the same bucket. That's why it's important to have hashCode as well distributed as possible, to avoid such collisions.

Using the system identity hash code of a key (i.e. the hash code returned by Object.hashCode()) is not desirable most of the time, because you want two keys to be equal if they hold the same information, not if they are the same object instance. For example, if you store a student in a map based on his SSN, and later get the SSN of this student from some web service or from the database, you won't have the same STring instance, but you want to be able to find the student in the map using the SSN you received.

Question 3

Why can't HashTable use actual memory address of the Key as Hash?

Because key equality matters. When two objects are "equal" (one.equals(two) returns true), the hash codes must also be equal (one.hashCode() == two.hashCode()).

Question 4

The default hashCode() is not the memory address, it's a the "identity hash".

The memory address may change, but the identity is constant for a given instance.

Question 5

You can have any implementation of hashCode that you feel is optimum without breaking the rules laid down for hashCode and equals in the JavaSE apis.

Check equals and hashcode rules here : http://docs.oracle.com/javase/6/docs/api/java/lang/Object.html

This is important as the Collections library and other apis heavily rely on this properties to implement their own behavior.

Question 6

The memory address should not be used as hashCode of an object (unless its equals method performs only an identity comparison). The reason is explicitly written in its JavaDoc:

If two objects are equal according to the {@code equals(Object)} method, then calling the {@code hashCode} method on each of the two objects must produce the same integer result.

In the case the the equals method performs only an identity comparison, the memory address is enough, as both hashCode() and equals() would be only equal for the same object.