Question

I'm trying to cache the result of an expensive function in a MemoryCache object.

The MemoryCache requires a key that is a string, so I was wondering if it was valid to do the following:

string key = Char.ConvertFromUtf32(myObject.GetHashCode());
if (!_resourceDescriptionCache.Contains(key))
{
    _resourceDescriptionCache[key] = ExpensiveFunction(myObject);
}
return (string)_resourceDescriptionCache[key];

It feels odd using a single UTF32 character as the key for a potentially large cache.


Solution

That depends.

There are many cases where using GetHashCode() could cause incorrect behavior:

A hash code is intended for efficient insertion and lookup in collections that are based on a hash table. A hash code is not a permanent value. For this reason:

  • Do not serialize hash code values or store them in databases.
  • Do not use the hash code as the key to retrieve an object from a keyed collection.
  • Do not send hash codes across application domains or processes. In some cases, hash codes may be computed on a per-process or per-application domain basis.

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

If the memory cache lives (or could in the future live) in a different process or app domain than the code that calls it, you violate the third rule.

"It feels odd using a single UTF32 character as the key for a potentially large cache."

If you are caching enough things, the collision rate on a 32-bit hash can be uncomfortably high due to the Birthday Problem.
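The single-character key also has a more immediate problem: Char.ConvertFromUtf32 only accepts valid Unicode code points (0 through 0x10FFFF, excluding the surrogate range 0xD800 through 0xDFFF) and throws ArgumentOutOfRangeException for anything else, while GetHashCode() can return any 32-bit integer, including negative ones. A minimal sketch that probes this; the HashKeyCheck and TryConvert names are hypothetical and exist only for illustration:

```csharp
using System;

public static class HashKeyCheck
{
    // Hypothetical helper: returns true when Char.ConvertFromUtf32 accepts
    // the value, false when it rejects it with ArgumentOutOfRangeException.
    public static bool TryConvert(int value, out string text)
    {
        try
        {
            text = Char.ConvertFromUtf32(value);
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            // Negative values, values above 0x10FFFF, and surrogates end up here.
            text = null;
            return false;
        }
    }
}
```

Only about 1.1 million of the roughly 4.3 billion possible int values pass this check, so for most hash codes the code in the question would throw before it ever reached the cache.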

When caching tens of millions of things, I have used a 64-bit hash called CityHash (created by Google, open source) with good success. You can also use a GUID, though a 128-bit GUID key takes twice the memory of a 64-bit hash.
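To put rough numbers on the 32-bit versus 64-bit comparison, here is a small sketch of the usual Birthday Problem approximation, p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)). It assumes hash values are uniformly distributed, which real GetHashCode() implementations do not guarantee; the BirthdayEstimate class name is hypothetical:

```csharp
using System;

public static class BirthdayEstimate
{
    // Approximate probability that at least two of itemCount uniformly
    // distributed hash values of hashBits bits are equal.
    public static double CollisionProbability(long itemCount, int hashBits)
    {
        double space = Math.Pow(2.0, hashBits);   // number of possible hash values
        double n = itemCount;
        return 1.0 - Math.Exp(-n * (n - 1.0) / (2.0 * space));
    }
}
```

Under these assumptions, roughly 77,000 items already give about a 50% chance of at least one collision among 32-bit hashes, while ten million items with a 64-bit hash stay far below 1%.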

Other tips

Hash codes can collide. return 0; is a valid implementation of GetHashCode. If you key the cache by hash code, multiple objects can share a cache slot, which is not what you want: the cache will confuse one object with another.

If your code does not work with return 0; as the implementation of GetHashCode, your code is broken.

Choose a better cache key.
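As a rough illustration of the difference, the sketch below uses a hypothetical Item type whose GetHashCode() always returns 0 and compares a hash-derived key with a key built from the object's own identity. A plain Dictionary stands in for the MemoryCache, Describe stands in for ExpensiveFunction, and the assumption that Name uniquely identifies an Item is mine, not something from the question:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type; Name is assumed to identify an Item uniquely.
public sealed class Item
{
    public Item(string name) { Name = name; }
    public string Name { get; }

    // Legal but degenerate: every instance reports the same hash code.
    public override int GetHashCode() { return 0; }

    public override bool Equals(object obj)
    {
        Item other = obj as Item;
        return other != null && other.Name == Name;
    }
}

public static class KeyDemo
{
    // Stand-in for ExpensiveFunction from the question.
    private static string Describe(Item item) { return "description of " + item.Name; }

    // Keys the cache by hash code, as in the question.
    public static string GetByHashKey(Dictionary<string, string> cache, Item item)
    {
        string key = Char.ConvertFromUtf32(item.GetHashCode());
        string value;
        if (!cache.TryGetValue(key, out value))
        {
            value = Describe(item);
            cache[key] = value;
        }
        return value;
    }

    // Keys the cache by the item's identifying data instead.
    public static string GetByStableKey(Dictionary<string, string> cache, Item item)
    {
        string key = "item:" + item.Name;
        string value;
        if (!cache.TryGetValue(key, out value))
        {
            value = Describe(item);
            cache[key] = value;
        }
        return value;
    }
}
```

With the hash-derived key, looking up item "b" after item "a" returns the description cached for "a". With the identity-based key, each item gets its own entry.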

The memory cache is backed by a hash table, much like the normal C# Dictionary. It really isn't much different, other than the fact that it provides expiration.

The chance that two given keys collide is about 1 in 2^32, the number of distinct values of a 32-bit integer. Even if you do manage to get a collision, the dictionary has safety measures around that (it falls back to Equals on a collision).

Edit: Key collisions are only handled when the dictionary is given the unaltered key (ex: Dictionary()). In this case the MemoryCache only sees the derived string, so there's no collision detection: two objects with the same hash code produce the same key.

Licensed under: CC-BY-SA with attribution