Question

I want to hash a string of length up-to 30. What will be the best idea to do that if time is my concern. The function will be called over 100 million times. currently I am using the following code,

static UInt64 CalculateHash(string read, bool lowTolerance)
{
    UInt64 hashedValue = 0;
    int i = 0;
    while (i < read.Length)
    {
        hashedValue += read.ElementAt(i) * (UInt64)Math.Pow(31, i);
        if (lowTolerance) i += 2;
        else i++;
    }
    return hashedValue;
}
Was it helpful?

Solution

static UInt64 CalculateHash(string read)
{
    UInt64 hashedValue = 3074457345618258791ul;
    for(int i=0; i<read.Length; i++)
    {
        hashedValue += read[i];
        hashedValue *= 3074457345618258799ul;
    }
    return hashedValue;
}

This is a Knuth hash. You can also use Jenkins.

OTHER TIPS

First of all, consider using GetHashCode().

A simple improvement on your existing implementation:

static UInt64 CalculateHash(string read, bool lowTolerance)
{
    UInt64 hashedValue = 0;
    int i = 0;
    ulong multiplier = 1;
    while (i < read.Length)
    {
        hashedValue += read[i] * multiplier;
        multiplier *= 37;
        if (lowTolerance) i += 2;
        else i++;
    }
    return hashedValue;
}

It avoids the expensive floating point calculation, and the overhead of ElementAt.

Btw (UInt64)Math.Pow(31, i) doesn't work well for longer strings. Floating point rounding will lead to a multiplier of 0 for characters beyond 15 or so.

To speed up your implementation, the (UInt64)Math.Pow(31, i) call should be replaced by a lookup: pre-calculate a table of the first 30 powers of 31, and use it at runtime. Since the limit on length is 30, you need only 31 element:

private static unsigned long[] Pow31 = new unsigned long[31];

static HashCalc() {
    Pow31[0] = 1;
    for (int i = 1 ; i != Pow31.Length ; i++) {
        Pow31[i] = 31*Pow31[i-1];
    }
}

// In your hash function...
hashedValue += read.ElementAt(i) * Pow31[i];

I have played with Paul Hsieh's implementations, and seem to be fast with little collisions (for my scenarios anyway)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top