Who says it hashes well?
A hash function maps an input, which in this case is a string, to an output, in this case an unsigned int
. The size of the input is (number of usable characters) ^ number of characters in the string
where ^
is "raised to the power of".
If your input string can contain only the characters 0
and 1
then the size of the input would be 2^ number of characters in the string
The size of the output is fixed, at the largest number representable in an unsigned int
.
This means that there is a "number of characters in the string" where the size of the input will be greater than the size of the output. By the pigeon hole principle you will definitely start having collisions. In reality, you probably had collisions before this threshold is reached.
If you want to use a hash function in your hash_map
or any other data structure, make sure it is tuned to your specific input. Don't go picking up the first one you find in the internet. A good hash function provides as few collisions as possible for your specific inputs.
A general purpose hash function may be sub-optimal in your specific case. A hash function specifically designed for some inputs (and this may very well be such a function) may work significantly worse on your inputs.