Question

The hash function is explained on Wikipedia

It says, "The choice of a and n is critical to get good hashing," and refers to a Linear congruential generator article that doesn't feel relevant. I can't figure out how the values are chosen. Any suggestions?


Solution

The basis of this algorithm is that, over a field, a nonzero polynomial of degree at most d has at most d zeros. Each length-k string has its own associated polynomial of degree at most k - 1, with the characters as coefficients, and the hash of the string is that polynomial evaluated at a, modulo n. We screen for possible matches by subtracting the polynomials of the strings in question and evaluating the difference at a. If the strings are equal, then the result is always zero. If the strings are not equal, then the result is zero if and only if a is one of the zeros of the difference polynomial. This is the fact that puts the primality requirement on n: the integers mod n are a field only when n is prime.
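To make the polynomial view concrete, here is a minimal sketch in Python. The names `poly_hash` and `collision_bases` are made up for illustration, and using `ord` values as coefficients with the small prime n = 101 is just an assumption to keep the brute-force search cheap. It lists every a at which two equal-length strings collide, which should be exactly the zeros of their difference polynomial.

```python
def poly_hash(s, a, n):
    """Evaluate the string's polynomial at a, modulo n, using Horner's rule.

    The characters (as ord values) are the coefficients, highest degree first.
    """
    h = 0
    for ch in s:
        h = (h * a + ord(ch)) % n
    return h


def collision_bases(s, t, n):
    """Brute-force every a in [0, n) at which s and t hash to the same value."""
    return [a for a in range(n) if poly_hash(s, a, n) == poly_hash(t, a, n)]


# "abc" and "bac" differ by the polynomial -a^2 + a = -a(a - 1),
# so with a prime n the only colliding bases are its two zeros, 0 and 1.
print(collision_bases("abc", "bac", 101))
```

With k = 3 there can be at most k - 1 = 2 bad values of a, and here both are used up by 0 and 1. This is also one way to see why a = 1 is a poor choice: the hash becomes a plain character sum, so any two anagrams collide.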

In theory, at least, we want a to be chosen at random so that an oblivious adversary (one who does not know a) cannot create false positives with any significant frequency. If we don't expect trouble, then it might be better to choose a so that multiplication by a is cheap (e.g., the binary expansion of a has a small number of one bits). Nevertheless, some choices are bad on typical string sets (e.g., a = 1). We want n to be large enough that false positives by random chance are rare (probability at most (k - 1)/n for a uniformly random a), but small enough, and preferably of a special form, so that the modulo computations are efficient.
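As a rough sketch of how these choices might look in practice, the search below draws a at random and takes n = 2^61 - 1, a Mersenne prime. Both choices, and the name `find_all`, are assumptions for illustration rather than anything prescribed by the Wikipedia article. Every hash hit is verified against the actual substring, so a false positive costs time but never produces a wrong answer.

```python
import random

# Assumed choice: a Mersenne prime, so reduction mod n can be done with
# shifts and adds in lower-level languages. Python simply uses %.
MERSENNE_61 = (1 << 61) - 1


def find_all(text, pattern, n=MERSENNE_61, a=None):
    """Return every index at which pattern occurs in text (rolling-hash search)."""
    k = len(pattern)
    if k == 0 or k > len(text):
        return []
    if a is None:
        a = random.randrange(2, n)  # random base, skipping the bad values 0 and 1
    top = pow(a, k - 1, n)  # weight of the character that leaves the window

    target = window = 0
    for i in range(k):
        target = (target * a + ord(pattern[i])) % n
        window = (window * a + ord(text[i])) % n

    matches = []
    for i in range(len(text) - k + 1):
        # Equal hashes only screen for a match; the slice comparison confirms it.
        if window == target and text[i:i + k] == pattern:
            matches.append(i)
        if i + k < len(text):
            # Drop text[i], shift by one power of a, and bring in text[i + k].
            window = ((window - ord(text[i]) * top) * a + ord(text[i + k])) % n
    return matches


print(find_all("abracadabra", "abra"))
```

Passing a small n or a = 1 explicitly still gives correct results because of the verification step, but more windows will pass the hash screen and have to be compared character by character.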

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow