Why would map be much faster than unordered_map?

https://stackoverflow.com/questions/4846798

27-10-2019
|

Question

I implemented a search caching results that consist of keys of type State (a class with 7 short ints) and values of type Socre (a class of 3 doubles.) Using unordered_map was at least 20 times slower than map. Why?

Edit: Darn it! My hash function was

namespace std {
    size_t hash<State>::operator()(State const& s) const {
        size_t retval = hash<short>()(s.s[0]);
        for (int i = 1; i < R; i += 2) {  // 1 3 5
            int x = (static_cast<int>(s.s[i + 1]) << 16)
                + (static_cast<int>(s.s[i]));
            hash_combine(retval, x);
        }
    }
}

I forgot to return retval, so it was all colliding! I wish unordered_map had a hash_function_quality() function that reports the average number of collisions.

Solution

The speed of unordered_map is directly proportional to the speed of your hashing function. It is never a straight forward relationship. Case in point, if you use the simplest hashing function:

std::size_t myHash(MyObjectType _object){ return 1; }

then what you'll end up with is a collection which behaves like a list rather than a hashed container. All the items will map to a single bucket and you'll have to traverse the entire bucket until you get to the item you desire (something that could take O(N) time.)

What you need to do is look at two things:

What hashing function are you using? Does it cost a ridiculous amount of time to process?
How many collisions is it producing? That is, how many unique elements are being mapped to the same hash value?

Either of those by themselves can and will kill the performance.

OTHER TIPS

std::unordered_map is commonly slow for a small number of elements because of the hash function. It takes a fixed (-ish) amount of time, but maybe a significant amount of time nonetheless.

std::map on the other hand is simpler than std::unordered_map. The time it takes accessing an element there depends on the count of elements, but less and less so as the number of elements grows. And the big-oh factor c for a std::map is commonly very small too, compared to std::unordered_map.

In general, prefer using std::map over std::unordered_map, unless you have a specific reason to use std::unordered_map. This holds particularly if you don't have a large number of elements.

unordered_map uses a hash table under the hood, so most obvious reason why hash performs poorly is because you have too many collisions. You may consider using different, non-default, hash function that will give better results for your type of keys.

For

I wish unordered_map had a hash_function_quality() function that reports the average number of collisions.

I think the following function might be helpful.

unordered_map::load_factor
    float load_factor() const;
The member function returns the average number of elements per bucket.

Lower the load_factor, better is the hash function.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow