Question

In the local object there is a collate facet.

The collate facet has a hash method that returns a long.
http://www.cplusplus.com/reference/std/locale/collate/hash/

Two questions:

  • Does anybody know what hashing method is used.
  • I need a 32bit value.
    If my long is longer than 32 bits, does anybody know about techniques for folding the hash into a shorter version. I can see that if done incorrectly that folding could generate lots of clashes (and though I can cope with clashes as I need to take that into account anyway, I would prefer if they were minimized).

Note: I can't use C++0x features
Boost may be OK.

Was it helpful?

Solution

No, nobody really knows -- it can vary from one implementation to another. The primary requirements are (N3092, §20.8.15):

For all object types Key for which there exists a specialization hash, the instantiation hash shall:

  1. satisfy the Hash requirements (20.2.4), with Key as the function call argument type, the DefaultConstructible requirements (33), the CopyAssignable requirements (37),
  2. be swappable (20.2.2) for lvalues,
  3. provide two nested types result_type and argument_type which shall be synonyms for size_t and Key, respectively,
  4. satisfy the requirement that if k1 == k2 is true, h(k1) == h(k2) is also true, where h is an object of type hash and k1 and k2 are objects of type Key.

and (N3092, §20.2.4):

A type H meets the Hash requirements if:

  1. it is a function object type (20.8),
  2. it satisifes the requirements of CopyConstructible and Destructible (20.2.1),
  3. the expressions shown in the following table are valid and have the indicated semantics, and
  4. it satisfies all other requirements in this subclause.

§20.8.15 covers the requirements on the result of hashing, §20.2.4 on the hash itself. As you can see, however, both are pretty general. The table that's mentioned basically covers three more requirements:

  1. A hash function must be "pure" (i.e., the result depends only on the input, not any context, history, etc.)
  2. The function must not modify the argument that's passed to it, and
  3. It must not throw any exceptions.

Exact algorithms definitely are not specified though -- and despite the length, most of the requirements above are really just stating requirements that (at least to me) seem pretty obvious. In short, the implementation is free to implement hashing nearly any way it wants to.

OTHER TIPS

If the implementation uses a reasonable hash function, there should be no bits in the hash value that have any special correlation with the input. So if the hash function gives you 64 "random" bits, but you only want 32 of them, you can just take the first/last/... 32 bits of the value as you please. Which ones you take doesn't matter since every bit is as random as the next one (that's what makes a good hash function).

So the simplest and yet completely reasonable way to get a 32-bit hash value would be:

int32_t value = hash(...);

(Of course this collapses groups of 4 billion values down to one, which looks like a lot, but that can't be avoided if there are four billion times as many source values as target values.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top