Overriding GetHashCode()

https://stackoverflow.com/questions/11481323

20-06-2021
|

Question

In this article, Jon Skeet mentioned that he usually uses this kind of algorithm for overriding GetHashCode().

public override int GetHashCode()
{
  unchecked // Overflow is fine, just wrap
  {
    int hash = 17;
    // Suitable nullity checks etc, of course :)
    hash = hash * 23 + Id.GetHashCode();
    return hash;
  }
}

Now, I've tried using this, but Resharper tells me that the method GetHashCode() should be hashing using only read-only fields (it compiles fine, though). What would be a good practice, because right now I can't really have my fields to be read-only?

I tried generating this method by Resharper, here's the result.

public override int GetHashCode()
{
  return base.GetHashCode();
}

This doesn't contribute much, to be honest...

Solution

If all your fields are mutable and you have to implement GetHashCode method, I am afraid this is the implementation you would need to have.

public override int GetHashCode() 
{ 
    return 1; 
}

Yes, this is inefficient but this is at least correct.

The problem is that GetHashCode is being used by Dictionary and HashSet collections to place each item in a bucket. If hashcode is calculated based on some mutable fields and the fields are really changed after the object is placed into the HashSet or Dictionary, the object can no longer be found from the HashSet or Dictionary.

Note that with all the objects returning the same HashCode 1, this basically means all the objects are being put in the same bucket in the HashSet or Dictionary. So, there is always only one single bucket in the HashSet or Dictionary. When trying to lookup the object, it will do a equality check on each of the objects inside the only bucket. This is like doing a search in a linked list.

Somebody may argue that implementing the hashcode based on mutable fields can be fine if we can make sure fields are never changed after the objects added to HashCode or Dictionary collection. My personal view is that this is error-prone. Somebody taking over your code two years later might not be aware of this and breaks the code accidentally.

OTHER TIPS

Please note that your GetHashCode must go hand in hand with your Equals method. And if you can just use reference equality (when you'd never have two different instances of your class that can be equal) then you can safely use Equals and GetHashCode that are inherited from Object. This would work much better than simply return 1 from GetHashCode.

I personally tend to return a different numeric value for each implementation of GetHashCode() in a class which has no immutable fields. This means if I have a dictionary containing different implementing types, there is a chance the different instances of different types will be put in different buckets.

For example

public class A
{
    // TODO Equals override

    public override int GetHashCode()
    {
        return 21313;
    } 
}

public class B
{
    // TODO Equals override

    public override int GetHashCode()
    {
        return 35507;
    } 
}

Then if I have a Dictionary<object, TValue> containing instances of A, B and other types , the performance of the lookup will be better than if all the implementations of GetHashCode returned the same numeric value.

It should also be noted that I make use of prime numbers to get a better distribution.

As per the comments I have provided a LINQPad sample here which demonstrates the performance difference between using return 1 for different types and returning a different value for each type.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow