My instructor dumped this on us and told us we just needed to google how to write a hash function. I am quite directionless on this. We wrote a basic hash table template for class, but I have a project due that requires ~160,000 strings to be hashed into a table with at least 500 buckets (I would like to use more for speed).

I just have no idea where to look to get concise, easily digestible information on this.

Any help would be greatly appreciated.


Solution

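I suggest a universal hash function. A function drawn at random from a universal family maps any two distinct keys to the same bucket with probability at most 1/m for m buckets, so the expected number of collisions stays small even if the data is chosen by an adversary. There are plenty of universal hash functions.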

In case of strings, you can adopt the following hash function.

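For a character c, let #(c) be its numeric value (i.e. its ASCII code). For a string x = c1c2...cn, define #(x) = #(c1) + #(c2)*256 + ... + #(cn)*256^(n-1), that is, the string read as a number in base 256 (this is the value the loop in the code below accumulates).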

If HSize is an integer (the number of buckets) and k is a large prime number of your choosing, then for any a and b with 0 < a, b < k*HSize, let the hash function be:

h(x) = ((a * #(x) + b) mod (k*HSize)) div k

where div is integer division.

This function produces values in the range [0, HSize-1]: the result of the modulo is below k*HSize, so integer division by k brings it below HSize.
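The collision guarantee of a universal family only holds when a and b are picked at random rather than fixed by hand. A minimal sketch of that, assuming the formula above; the type name UniversalStringHash and the use of std::mt19937_64 are illustrative choices, not part of the original answer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical helper: one randomly chosen member of the hash family.
struct UniversalStringHash {
    std::uint64_t k;      // large prime
    std::uint64_t hsize;  // number of buckets
    std::uint64_t a, b;   // random parameters with 0 < a, b < k * hsize

    UniversalStringHash(std::uint64_t k_, std::uint64_t hsize_, std::mt19937_64& rng)
        : k(k_), hsize(hsize_) {
        // Assumption: k * hsize stays below 2^32 so a * res cannot overflow 64 bits.
        assert(k * hsize > 1 && k * hsize < (1ull << 32));
        std::uniform_int_distribution<std::uint64_t> dist(1, k * hsize - 1);
        a = dist(rng);
        b = dist(rng);
    }

    std::uint64_t operator()(const std::string& text) const {
        const std::uint64_t M = k * hsize;
        std::uint64_t res = 0;
        // Horner's rule from the last character to the first, reduced modulo M.
        for (std::size_t i = text.size(); i-- > 0;) {
            res = (res * 256 + static_cast<unsigned char>(text[i])) % M;
        }
        // Apply a and b once, reduce, then divide by k to get a bucket index.
        return ((a * res + b) % M) / k;
    }
};
```

Seeding the generator from std::random_device at program start gives a different function on every run, which is what defeats adversarial inputs; a fixed seed is handy while debugging.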

The output value is computed with Horner's rule, taking the result modulo k*HSize after every operation so that the intermediate values stay small.
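To see why reducing after every step is safe, here is a small sketch that computes the base-256 value of a string modulo M in two ways: with Horner's rule, and by summing each character times its own power of 256. Both function names are made up for this illustration, and M = 7919 * 1009 = 7990271 in the usage note is just the product of the example constants used in this answer.

```cpp
#include <cstddef>
#include <string>

// Horner's rule: walk from the last character to the first,
// reducing modulo M after every multiply-and-add.
long long valueModHorner(const std::string& text, long long M) {
    long long res = 0;
    for (int i = (int)text.size() - 1; i >= 0; i--) {
        res = (res * 256 + (unsigned char)text[i]) % M;
    }
    return res;
}

// Direct evaluation: sum of character * 256^position, with the power
// itself kept reduced modulo M.
long long valueModDirect(const std::string& text, long long M) {
    long long sum = 0;
    long long power = 1 % M;
    for (std::size_t i = 0; i < text.size(); i++) {
        sum = (sum + (unsigned char)text[i] * power) % M;
        power = (power * 256) % M;
    }
    return sum;
}
```

For "abc" with M = 7990271 both return 97 + 98*256 + 99*65536 = 6513249, since that value is already below M.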

So, create a function HashFunction and pass the string you want to hash as a parameter. Here is the code:

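#include <string>

// k is a large prime, Hsize is the number of buckets,
// and a, b must satisfy 0 < a, b < k * Hsize.
const long long k = 7919;
const long long Hsize = 1009;
const long long a = 321;
const long long b = 43112;

long long HashFunction(const std::string& text)
{
  const long long M = Hsize * k;
  long long res = 0;
  // Horner's rule: accumulate the string as a base-256 number,
  // reducing modulo M after every step so res stays small.
  for (int i = (int)text.size() - 1; i >= 0; i--)
  {
    // Go through unsigned char so bytes above 127 are not negative.
    res = (res * 256 + (unsigned char)text[i]) % M;
  }
  // Apply a and b once, reduce modulo M, then divide by k.
  // The result is always in [0, Hsize-1].
  return ((a * res + b) % M) / k;
}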
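To connect this to the bucket table from the question, here is a rough sketch of a chained table that takes the hash function as a parameter. The class StringTable and its method names are hypothetical, not your class template; it assumes bucketCount is greater than zero and applies a final modulo as a safety net in case the hash ever returns an out-of-range index.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chained hash table of strings; assumes bucketCount > 0.
class StringTable {
public:
    using HashFn = std::function<std::size_t(const std::string&)>;

    StringTable(std::size_t bucketCount, HashFn hash)
        : buckets_(bucketCount), hash_(std::move(hash)) {}

    // Returns false if the string was already present.
    bool insert(const std::string& s) {
        std::vector<std::string>& chain = buckets_[indexOf(s)];
        if (std::find(chain.begin(), chain.end(), s) != chain.end()) return false;
        chain.push_back(s);
        return true;
    }

    bool contains(const std::string& s) const {
        const std::vector<std::string>& chain = buckets_[indexOf(s)];
        return std::find(chain.begin(), chain.end(), s) != chain.end();
    }

    // Length of the fullest bucket: a quick check of how evenly the hash spreads keys.
    std::size_t longestChain() const {
        std::size_t longest = 0;
        for (const auto& chain : buckets_) longest = std::max(longest, chain.size());
        return longest;
    }

private:
    std::size_t indexOf(const std::string& s) const {
        return hash_(s) % buckets_.size();
    }

    std::vector<std::vector<std::string>> buckets_;
    HashFn hash_;
};
```

With the function above you would construct it as StringTable table(1009, [](const std::string& s) { return (std::size_t)HashFunction(s); }); and then call table.insert(word) for each of the ~160,000 strings. With 1009 buckets that is roughly 160 strings per bucket on average, so a larger Hsize will shorten the chains.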
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow