Overriding equals() method in Java

https://softwareengineering.stackexchange.com/questions/272050

07-10-2020

Question

Short question: Why does Java allow overriding equals(), why is it not final?

I am reading Effective Java 2nd edition by Joshua Bloch. I am a bit baffled by the conclusion that

There is no way to extend an instantiable class and add a value component while preserving the equals contract

Isn't that the same as saying equals() should be a final method?

The book gives an example of a class A with an equals() method, and then a class AX extending A with its own equals() method that differs from the one in A.

I will not go into the details of equals() in A and equals() in AX; it suffices to say that they are different. We therefore have an interoperability problem that guarantees a violation of transitivity and/or symmetry (maybe even something else) of equals() when we mix different implementations of A in some contexts (especially HashSet- and HashMap-type collections).

Thinking further, I don't think I can agree with the conclusion that having something like

public boolean equals(Object o) {
    if (o == null || o.getClass() != getClass())
      return false;
    ...
}

is wrong. I think this is precisely the proper way to deal with overriding equals().
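A minimal sketch of the getClass()-based approach described above, assuming a hypothetical A with a single int field and a hypothetical AX that adds a String field (neither the fields nor the demo class come from the book). With an exact class check, an A and an AX never compare equal in either direction, so symmetry holds, at the cost of an AX no longer being interchangeable with an A for equality purposes.

```java
import java.util.Objects;

class GetClassEqualsDemo {
    public static void main(String[] args) {
        A a = new A(1);
        A ax = new AX(1, "red");
        System.out.println(a.equals(ax));                 // false
        System.out.println(ax.equals(a));                 // false
        System.out.println(ax.equals(new AX(1, "red")));  // true
    }
}

// Hypothetical base class; the field x is illustrative only.
class A {
    private final int x;

    A(int x) { this.x = x; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Exact class match: an AX is never equal to a plain A.
        if (o == null || o.getClass() != getClass()) return false;
        return x == ((A) o).x;
    }

    @Override
    public int hashCode() { return Integer.hashCode(x); }
}

// Hypothetical subclass that adds a value component.
class AX extends A {
    private final String tag;

    AX(int x, String tag) { super(x); this.tag = tag; }

    @Override
    public boolean equals(Object o) {
        // super.equals already rejects null and objects of a different class,
        // so the cast is only reached when o is an AX.
        return super.equals(o) && Objects.equals(tag, ((AX) o).tag);
    }

    @Override
    public int hashCode() { return Objects.hash(super.hashCode(), tag); }
}
```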

Java allows overriding equals() for a reason. If it had taken the Liskov substitution principle into account in the strict sense, it would not have allowed overriding equals() and would have implicitly made any implementation of equals() final at the compiler level. What are your thoughts?

I can think of a case where composition is simply not suitable, and overriding equals() is the best option. This is the case where the class A is to be made persistent in a database and the context implies that there is no collection having both an implementation of A and subclasses of A such as AX.

Solution

equals() is a byproduct of the attempt to improve on C++ when Java was created. C++ has operator overloading, which lets you define custom behavior for otherwise valid operators such as <, >, !=, ==, and even =.

The team decided (wisely) to make equality a method of the class rather than an external static function, as was done in C++. However, this also meant that equals(), coupled with hashCode(), defines how such classes are handled in collection classes.

Since any class could in theory override equals() or hashCode(), having a collection of a certain type does not guarantee that behavior is uniform.

For instance, suppose class A has two members, x and y, that determine equality. Along comes class B, which extends A and has x, y, and z. If an instance of B has the same x and y values as an instance of A, how would you go about inserting it into a Set? If you call equals() on the instance of A, it will judge the two equal; if you call equals() on the instance of B, it will return false, since the other object is not an instance of B.
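The asymmetry described above can be sketched as follows. This assumes both classes use an instanceof check and that B extends A; the int field types and the demo class are illustrative assumptions, not something the answer specifies.

```java
class SymmetryDemo {
    public static void main(String[] args) {
        A a = new A(1, 2);
        B b = new B(1, 2, 3);
        System.out.println(a.equals(b)); // true: A only compares x and y
        System.out.println(b.equals(a)); // false: a is not an instance of B
    }
}

class A {
    private final int x;
    private final int y;

    A(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        // Accepts any A, including subclasses such as B.
        if (!(o instanceof A)) return false;
        A other = (A) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() { return 31 * x + y; }
}

class B extends A {
    private final int z;

    B(int x, int y, int z) { super(x, y); this.z = z; }

    @Override
    public boolean equals(Object o) {
        // Rejects a plain A, which is what breaks symmetry with A.equals.
        if (!(o instanceof B)) return false;
        return super.equals(o) && z == ((B) o).z;
    }

    @Override
    public int hashCode() { return 31 * super.hashCode() + z; }
}
```

Because the answer depends on which object's equals() is called, the result of mixing such instances in one collection depends on the collection's internal call order.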

To be perfectly correct, class B would have to treat member z as an additional condition only when the other object is also an instance of B, and otherwise defer to the equals() method of class A; if that is not possible, class B should not override equals() or hashCode() at all. This creates a sticky situation: in theory you should not have to concern yourself with how the parent class is implemented (if done right, anyway), and yet here we are.

You could make class A final to prevent such things from happening, but then of course you can never extend class A. Java makes a point of declaring certain standard classes, such as String, final to prevent complications of this nature (a very smart decision on their part). At the end of the day, what matters is that you are very careful in how you use equals() and hashCode(). I try to override them sparingly, and I am always mindful of which classes I mean to expose in a library and which are for internal use, so as not to create conditions where things could go horribly wrong.

The Liskov substitution principle is fine in theory, but in practice you can never quite manage it. Take Collection as an example. Collection is implemented by ArrayList, HashSet, and LinkedList, among others. While it is true that you could achieve the same end result by replacing a Collection with, say, a HashSet, it is not an ideal implementation for performing operations on all the objects it contains (a LinkedHashSet would be better at that point). It wouldn't break existing code, but you could render it grossly inefficient, depending on how that Collection is used. Consider that this is a rather clean example, too.

If you're lucky, only the implementation details change, but many implementations behave radically differently, with some methods throwing an UnsupportedOperationException.
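As a small illustration of that last point, the sketch below passes two Collection implementations to the same code. The supportsAdd helper is hypothetical and exists only for this example; Collections.unmodifiableCollection is the standard JDK wrapper whose add() throws UnsupportedOperationException.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

class CollectionSubstitutionDemo {
    // Hypothetical helper: reports whether add() is supported.
    // Note that it mutates the collection when add() succeeds.
    static boolean supportsAdd(Collection<String> c) {
        try {
            c.add("probe");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Collection<String> mutable = new ArrayList<String>();
        Collection<String> readOnly =
                Collections.unmodifiableCollection(new ArrayList<String>());

        System.out.println(supportsAdd(mutable));  // true
        System.out.println(supportsAdd(readOnly)); // false
    }
}
```

Both arguments have the static type Collection<String>, yet only one of them honors add(), which is the kind of substitution gap the answer is pointing at.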

Thus, requiring classes that implement equals() to respect the Liskov substitution principle is asking a lot, and I suspect the designers didn't want to alienate the majority of C++ programmers getting familiar with Java.

Other suggestions

A simple argument against your logic: what use would the equals method be if it were made final? It would then execute the same logic for any given objects. If you wanted to compare A and AX in the Object class, what would the logic be?

Overriding the equals method is allowed so that you can check, based on logic you define, whether two different objects (living in two different memory locations) are actually equal (the == operator is there to check whether two references point to the same object). For example, if you have a user object like this,

class User {
    String name;
    String email;
    String phone;
}

For one system, the email might be what uniquely identifies a user, so you would compare on the email value. Another system might use the phone number as the unique identifier. The two systems may therefore decide to implement different equals methods.
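A sketch of the email-based variant, assuming a system where email is the unique identifier. The constructor, the final modifier, and the sample values are illustrative additions; a phone-based system would compare and hash phone instead.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class UserEqualityDemo {
    public static void main(String[] args) {
        User u1 = new User("Ann", "ann@example.com", "111");
        User u2 = new User("Ann B.", "ann@example.com", "222");

        System.out.println(u1 == u2);      // false: two distinct objects
        System.out.println(u1.equals(u2)); // true: same email

        Set<User> users = new HashSet<User>();
        users.add(u1);
        users.add(u2);
        System.out.println(users.size());  // 1
    }
}

// Declared final so that no subclass can alter the equality rules.
final class User {
    final String name;
    final String email;
    final String phone;

    User(String name, String email, String phone) {
        this.name = name;
        this.email = email;
        this.phone = phone;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof User)) return false;
        // Only email takes part in equality in this system.
        return Objects.equals(email, ((User) o).email);
    }

    @Override
    public int hashCode() {
        // Must use the same field(s) as equals().
        return Objects.hashCode(email);
    }
}
```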

When it comes to the logic of A, AX, and the Liskov substitution principle, which is,

Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.

So if you do,

AX ax1 = new AX();
AX ax2 = new AX();
ax1.equals(ax2);

and

A a = new AX();
AX ax = new AX();
a.equals(ax);

should give you the same output, and it will.
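The two snippets agree because equals() is dispatched on the runtime type of the receiver, not on the declared type of the variable. A minimal sketch, assuming hypothetical fields id (in A) and extra (in AX) that the answer does not define:

```java
class DispatchDemo {
    public static void main(String[] args) {
        AX ax1 = new AX(1, 5);
        AX ax2 = new AX(1, 7);
        A a = new AX(1, 5); // declared as A, but the object is an AX

        System.out.println(ax1.equals(ax2)); // false: extra differs
        System.out.println(a.equals(ax2));   // false: AX.equals runs, not A.equals
        System.out.println(a.equals(ax1));   // true
    }
}

class A {
    private final int id;

    A(int id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof A && id == ((A) o).id;
    }

    @Override
    public int hashCode() { return Integer.hashCode(id); }
}

class AX extends A {
    private final int extra;

    AX(int id, int extra) { super(id); this.extra = extra; }

    @Override
    public boolean equals(Object o) {
        // Only AX-to-AX comparisons are exercised in this demo.
        return o instanceof AX && super.equals(o) && extra == ((AX) o).extra;
    }

    @Override
    public int hashCode() { return 31 * super.hashCode() + extra; }
}
```

If A.equals had been chosen from the declared type, a.equals(ax2) would have returned true, since both objects share the same id.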

To give a short answer to your short question - it's because the equals() function defined for java.lang.Object is largely useless. It's just equivalent to "==" in that it returns true if the two references refer to the same object, and doesn't really test if they are "equal" in the normal sense of the word.

Thus, if you actually want to compare two objects to see if they hold equal data, you need to override the default implementation of equals(). You normally should override hashCode() as well.
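To make the difference concrete, here is a small sketch with two hypothetical classes: PlainPoint inherits the default equals() and hashCode() from Object, while ValuePoint overrides both. The class names and fields are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Set;

class DefaultEqualsDemo {
    public static void main(String[] args) {
        // Default behavior: identity comparison, same as ==.
        System.out.println(new PlainPoint(1, 2).equals(new PlainPoint(1, 2))); // false

        // Overridden behavior: field-by-field comparison.
        System.out.println(new ValuePoint(1, 2).equals(new ValuePoint(1, 2))); // true

        Set<Object> set = new HashSet<Object>();
        set.add(new ValuePoint(1, 2));
        set.add(new ValuePoint(1, 2));
        System.out.println(set.size()); // 1: the second add is a duplicate
    }
}

// Inherits Object.equals and Object.hashCode unchanged.
class PlainPoint {
    final int x;
    final int y;

    PlainPoint(int x, int y) { this.x = x; this.y = y; }
}

// Overrides both methods so that equal data means equal objects.
final class ValuePoint {
    final int x;
    final int y;

    ValuePoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ValuePoint)) return false;
        ValuePoint other = (ValuePoint) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() { return 31 * x + y; }
}
```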

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange