Question

I have a class Foo with two fields where the Equals and GetHashCode methods have been overridden:

public class Foo
{
    private readonly int _x;
    private readonly int _y;

    public Foo(int x, int y) { _x = x; _y = y; }

    public override bool Equals(object obj) {
        Foo other = obj as Foo;
        return other != null && _y == other._y;
    }

    public override int GetHashCode() { return _y; }
}

If I create an array of Foo instances and count the distinct values in it:

var array = new[] { new Foo(1, 1), new Foo(1, 2), new Foo(2, 2), new Foo(3, 2) };
Console.WriteLine(array.Distinct().Count());

The number of distinct values is recognized as:

2

If I now make my class Foo implement IEquatable<Foo> using the following implementation:

public bool Equals(Foo other) { return _y == other._y; }

The number of distinct values is still:

2

But if I change the implementation to this:

public bool Equals(Foo other) { return _x == other._x; }

The computed number of distinct Foo instances is neither 3 (the number of distinct _x values) nor 2 (the number of distinct _y values), but:

4

And if I comment out the Equals and GetHashCode overrides but keep the IEquatable<Foo> implementation, the answer is also 4.

According to the MSDN documentation, this Distinct overload should use the static property EqualityComparer<T>.Default to define the equality comparison, and:

The Default property checks whether type T implements the System.IEquatable<T>
interface and, if so, returns an EqualityComparer<T> that uses that 
implementation. Otherwise, it returns an EqualityComparer<T> that uses the 
overrides of Object.Equals and Object.GetHashCode provided by T.

But looking at the experiment above, this statement does not seem to hold. At best, the IEquatable<Foo> implementation supports the already provided Equals and GetHashCode overrides, and at worst it completely corrupts the equality comparison.

My questions:

  • Why does the independent implementation of IEquatable<T> corrupt the equality comparison?
  • Can it play a role independent of the Equals and GetHashCode overrides?
  • If not, why does EqualityComparer<T>.Default look for this implementation first?

Solution

Your GetHashCode method depends only on _y. If your Equals method doesn't depend on _y as well, you've broken the contract of equality: two objects that compare equal must return the same hash code, and yours no longer do.

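Distinct() expects equal elements to have the same hash code. In your case, the only two elements that are equal by _x, Foo(1, 1) and Foo(1, 2), have different hash codes (1 and 2), so Equals is never even called to compare them. The remaining elements share the hash code 2 but differ in _x, so every element ends up counted as distinct, which gives 4.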
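To see which comparisons Distinct() actually performs, here is a minimal sketch. TracedFoo and DistinctTrace are hypothetical names introduced for illustration; the class mirrors the Foo from the question (Equals(Foo) on _x, GetHashCode on _y) and only adds logging. The order of the logged pairs is an implementation detail and may differ between runtimes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical instrumented copy of Foo: Equals(TracedFoo) compares _x,
// while GetHashCode still returns _y, as in the question.
public sealed class TracedFoo : IEquatable<TracedFoo>
{
    // Records every pair that Equals(TracedFoo) is asked to compare.
    public static readonly List<string> Comparisons = new List<string>();

    private readonly int _x;
    private readonly int _y;

    public TracedFoo(int x, int y) { _x = x; _y = y; }

    public bool Equals(TracedFoo other)
    {
        if (ReferenceEquals(other, null)) return false;
        Comparisons.Add(this + " vs " + other);
        return _x == other._x;
    }

    // Same as the question: the object overload compares _y.
    public override bool Equals(object obj)
    {
        TracedFoo other = obj as TracedFoo;
        return other != null && _y == other._y;
    }

    public override int GetHashCode() { return _y; }

    public override string ToString() { return "(" + _x + ", " + _y + ")"; }
}

public static class DistinctTrace
{
    // Runs Distinct() over the array from the question and returns the count.
    public static int Run()
    {
        TracedFoo.Comparisons.Clear();
        var array = new[]
        {
            new TracedFoo(1, 1), new TracedFoo(1, 2),
            new TracedFoo(2, 2), new TracedFoo(3, 2)
        };
        return array.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 4
        foreach (string pair in TracedFoo.Comparisons)
            Console.WriteLine(pair);
    }
}
```

None of the logged pairs should involve (1, 1): it is the only element with hash code 1, so it is never handed to Equals alongside (1, 2), even though the two would compare equal by _x.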

From the docs of IEquatable<T>.Equals:

If you implement Equals, you should also override the base class implementations of Object.Equals(Object) and GetHashCode so that their behavior is consistent with that of the IEquatable<T>.Equals method.

Your implementation of Equals(Foo) isn't consistent with either Equals(object) or GetHashCode.
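A consistent version could look like the sketch below. ConsistentFoo and ConsistentFooDemo are hypothetical names, and using _x as the field that defines equality is an assumption made for illustration; the point is only that all three members are derived from the same data.

```csharp
using System;
using System.Linq;

// Sketch of a consistent Foo: Equals(T), Equals(object) and GetHashCode
// all depend on the same field. Choosing _x is an assumption for this example.
public sealed class ConsistentFoo : IEquatable<ConsistentFoo>
{
    private readonly int _x;
    private readonly int _y;

    public ConsistentFoo(int x, int y) { _x = x; _y = y; }

    // Strongly typed comparison, the one EqualityComparer<T>.Default uses.
    public bool Equals(ConsistentFoo other)
    {
        return !ReferenceEquals(other, null) && _x == other._x;
    }

    // Forward to the typed overload so the two can never disagree.
    public override bool Equals(object obj)
    {
        return Equals(obj as ConsistentFoo);
    }

    // Hash exactly the data that Equals compares.
    public override int GetHashCode() { return _x; }

    public override string ToString() { return "(" + _x + ", " + _y + ")"; }
}

public static class ConsistentFooDemo
{
    // Counts distinct values in the same array shape as the question.
    public static int CountDistinct()
    {
        var array = new[]
        {
            new ConsistentFoo(1, 1), new ConsistentFoo(1, 2),
            new ConsistentFoo(2, 2), new ConsistentFoo(3, 2)
        };
        return array.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // prints 3
    }
}
```

The class is sealed here to sidestep the question of how equality should behave across subclasses; that is a design choice of this sketch, not something the question requires.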

EqualityComparer<T>.Default will still delegate to your GetHashCode method - it will just use your Equals(T) method in preference to your Equals(object) method.

So to answer your questions in order:

  • Why does the independent implementation of IEquatable<T> corrupt the equality comparison?

Because you've introduced an inconsistent implementation. It's not meant to be independent in terms of behaviour. It's just meant to be more efficient by avoiding a type check (and boxing, for value types).

  • Can it play a role independent of the Equals and GetHashCode overrides?

No. It should be consistent with Equals(object) for the sake of sanity, and it must be consistent with GetHashCode for the sake of correctness.

  • If not, why does EqualityComparer<T>.Default look for this implementation first?

To avoid runtime type checking and boxing/unboxing, primarily.
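The difference is easiest to observe with value types. In the sketch below, TypedPoint, UntypedPoint and ComparerDemo are hypothetical names introduced for illustration. Both structs count which Equals overload is invoked when they are compared through EqualityComparer<T>.Default; only the first one implements IEquatable<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct that implements IEquatable<T> and counts calls per overload.
public struct TypedPoint : IEquatable<TypedPoint>
{
    public static int TypedCalls;
    public static int ObjectCalls;

    public readonly int X;
    public TypedPoint(int x) { X = x; }

    public bool Equals(TypedPoint other) { TypedCalls++; return X == other.X; }

    public override bool Equals(object obj)
    {
        ObjectCalls++;
        return obj is TypedPoint && X == ((TypedPoint)obj).X;
    }

    public override int GetHashCode() { return X; }
}

// Same data, but without IEquatable<T>: only the object overload exists.
public struct UntypedPoint
{
    public static int ObjectCalls;

    public readonly int X;
    public UntypedPoint(int x) { X = x; }

    public override bool Equals(object obj)
    {
        ObjectCalls++; // the argument arrives boxed and needs a type check
        return obj is UntypedPoint && X == ((UntypedPoint)obj).X;
    }

    public override int GetHashCode() { return X; }
}

public static class ComparerDemo
{
    // Resets the counters and compares one pair of each struct type.
    public static void Run()
    {
        TypedPoint.TypedCalls = 0;
        TypedPoint.ObjectCalls = 0;
        UntypedPoint.ObjectCalls = 0;

        EqualityComparer<TypedPoint>.Default.Equals(new TypedPoint(1), new TypedPoint(1));
        EqualityComparer<UntypedPoint>.Default.Equals(new UntypedPoint(1), new UntypedPoint(1));
    }

    public static void Main()
    {
        Run();
        // prints "TypedPoint: typed=1, object=0"
        Console.WriteLine("TypedPoint: typed=" + TypedPoint.TypedCalls + ", object=" + TypedPoint.ObjectCalls);
        // prints "UntypedPoint: object=1"
        Console.WriteLine("UntypedPoint: object=" + UntypedPoint.ObjectCalls);
    }
}
```

For TypedPoint the comparer goes straight to Equals(TypedPoint), so nothing is boxed; for UntypedPoint it has to fall back to Equals(object), which boxes the argument before the type check.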

Licensed under: CC-BY-SA with attribution