Question

Is it meant to be guaranteed that equal std::type_info::hash_code() values imply the same type?

Cplusplus.com seems to claim so:

This function returns the same value for any two type_info objects that compare equal, and different values for distinct types that do not. [Emphasis mine]

Cppreference seems to claim otherwise:

Returns an unspecified value, which is identical for objects, referring to the same type. No other guarantees are given, in particular, the value can change between invocations of the same program. [Emphasis mine]

The relevant standards paragraphs are:

§ 18.7.1 p7-8

size_t hash_code() const noexcept;

7 Returns: An unspecified value, except that within a single execution of the program, it shall return the same value for any two type_info objects which compare equal.

8 Remark: an implementation should return different values for two type_info objects which do not compare equal. [Emphasis mine]

What's the meaning of "should" supposed to be in the context above? If paragraph 8 is meant to be a requirement, then it seems impossible to fulfill unless the runtime does some kind of global uniquing over all symbol names in a program to ensure lack of hash collision, which seems to be a pretty big burden for the standard to foist upon implementations, especially for a function called hash_code(). (Itanium actually requires this, but it's explicitly an extra requirement above the standard.)

If "should" is not meant to be binding, then the remark seems to be a pointless one and a defect in the standard, since asking implementations to try to fulfill a difficult requirement that cannot be relied upon anyway provides no value and only invites confusion and fragmentation. Anyone know why it's there?

EDIT: Maybe "defect" was too strong a word, but at the very least it's a point of possible confusion that should be clarified, since it has apparently misled at least one reference site and, transitively, anyone relying on it. Furthermore, it actually is possible to fulfill the requirement (as long as the number of types supported by the implementation is smaller than the range of size_t) if global uniquing is done at runtime, and it's unclear whether the standard is trying to suggest this as the ideal implementation strategy or not.
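To make the practical consequence concrete, here is a minimal sketch (my own illustration, assuming a conforming C++11 implementation; the helper name is hypothetical) of what portable code has to do if the remark is non-binding: treat hash_code() only as a fast pre-filter and let operator==, which is guaranteed, decide identity.

#include <typeinfo>
#include <iostream>

// Hypothetical helper: hash_code() is only a fast pre-filter here;
// operator== is what actually guarantees type identity.
bool same_type(const std::type_info& a, const std::type_info& b)
{
    if (a.hash_code() != b.hash_code())
        return false;   // equal types are guaranteed to hash equally
    return a == b;      // distinct types may still collide, so confirm
}

int main()
{
    std::cout << std::boolalpha
              << same_type(typeid(int), typeid(int))  << '\n'   // true
              << same_type(typeid(int), typeid(long)) << '\n';  // false
}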


Solution

The meaning looks pretty clear to me: it's not an absolute requirement because it may be impossible to meet under some circumstances, but the implementation should attempt to produce unique values to the extent possible.

I'd note that the same is true of hash codes in general -- you try to produce values that are unique, but it's not always possible.
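As a concrete illustration of that point (a minimal sketch, assuming C++11): std::type_index hashes through type_info::hash_code() but compares through type_info::operator==, so even if two distinct types collided, an unordered container keyed by std::type_index would only take a slower lookup, never return the wrong entry.

#include <typeindex>
#include <unordered_map>
#include <string>
#include <iostream>

int main()
{
    // Hashing uses type_info::hash_code(); equality uses type_info::operator==.
    // A hash collision can therefore only cost speed, not correctness.
    std::unordered_map<std::type_index, std::string> names;
    names[typeid(int)]    = "int";
    names[typeid(double)] = "double";

    std::cout << names[typeid(double)] << '\n';  // prints "double"
}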

The standard contains a lot of information that's not enforceable. Quite a bit (but certainly not all) is in the form of explicit Notes, but that doesn't mean everything non-normative outside a Note is a defect.

Edit: in case anybody wants to know what the ISO says about how standards should be written, they have a page of guidelines.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow