I suggest a universal hash function. This kind of function guarantees a small number of collisions in expectation, even if the data is chosen by an adversary. There are plenty of universal hash functions.
In case of strings, you can adopt the following hash function.
For a character c we define #(c) the arithmetic value of c ie(ASCII). For a string x=c1c1...cn
we define
If HSize is an integer and k a big prime number (you define it), for a range 0<a,b<k*HSize
let the hash function be:
This function provides output between [0, HSize-1]
The output value is calculated by horner's rule, finding the modulo of the k*HSize
division after every operation.
So, create a function HashFunction and pass the string you want to hash as a parameter. Here is the code:
#define k 7919
#define Hsize 1009
#define a 321
#define b 43112
long long HashFunction(string text)
{
int i;
long long res = 0;
long long M = (Hsize * k);
cout<<"M = "<<M<<endl;
cout<<"Hsize = "<<Hsize<<endl;
cout<<"k = "<<k<<endl;
int s=text.size();
for(i = s-1; i >= 0; i--)
{
res = a * (res * 256 + (int)text[i]);
//cout<<"res before modulo = "<<res<<endl;
res=res % M;
//cout<<"res after modulo = "<<res<<endl;
}
long long res1 = (res + b) / k;
return res1;
}