Question

I have a requirement to hash input strings and produce 14 digit decimal numbers as output.

The math I am using tells me I can have, at maximum, a 46 bit unsigned integer.

I am aware that a 46 bit uint means less collision resistance for any potential hash function. However, the number of hashes I am creating keeps the collision probability in an acceptable range.

I would be most grateful if the community could help me verify that my method for truncating a hash to 46 bits is solid. I have a gut feeling that there are optimizations and/or easier ways to do this. My function is as follows (where bitLength is 46 when this function is called):

    public static UInt64 GetTruncatedMd5Hash(string input, int bitLength)
    {
        var md5Hash = MD5.Create();

        byte[] fullHashBytes = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));

        var fullHashBits = new BitArray(fullHashBytes);

        // BitArray stores LSB of each byte in lowest indexes, so reversing...
        ReverseBitArray(fullHashBits);

        // truncate by copying only number of bits specified by bitLength param
        var truncatedHashBits = new BitArray(bitLength);
        for (int i = 0; i < bitLength - 1; i++)
        {
            truncatedHashBits[i] = fullHashBits[i];
        }

        byte[] truncatedHashBytes = new byte[8];

        truncatedHashBits.CopyTo(truncatedHashBytes, 0);

        return BitConverter.ToUInt64(truncatedHashBytes, 0);
    }

Thanks for taking a look at this question. I appreciate any feedback!

Était-ce utile?

La solution

With the help of the comments above, I crafted the following solution:

 public static UInt64 GetTruncatedMd5Hash(string input, int bitLength)
 {
        if (string.IsNullOrWhiteSpace(input)) throw new ArgumentException("input must not be null or whitespace");

        if(bitLength > 64) throw new ArgumentException("bitLength must be <= 64");

        var md5Hash = MD5.Create();

        byte[] fullHashBytes = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));

        if(bitLength == 64)
            return BitConverter.ToUInt64(fullHashBytes, 0);

        var bitMask = (1UL << bitLength) - 1UL;

        return BitConverter.ToUInt64(fullHashBytes, 0) & bitMask;
    }

It's much tighter (and faster) than what I was trying to do before.

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top