Question

I want to partition my users into several groups to run an A/B test.

The usual approach is to randomly assign each user to a variant and store that assignment until the end of the A/B test. But that forces me to store the association somewhere, and I want to avoid that.

Since the users are already registered in my application, I would like a function that distributes them uniformly across my test variants so that I get non-skewed results in my A/B test.

Which kind of hash function should I use?

Solution

This ACM paper explains that MD5 is a good hash function for getting both a uniform distribution and no correlations between experiments:

We found that only the cryptographic hash function MD5 generated no correlations between experiments. SHA256 (another cryptographic hash) came close, requiring a five-way interaction to produce a correlation. The .NET string hashing function failed to pass even a two-way interaction test.
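
As a minimal sketch of how this could look in practice (not taken from the paper), the user ID can be hashed with MD5 and the result reduced to a variant index. The function name `assign_variant`, the `experiment` salt, and the `"experiment:user_id"` key format are assumptions made for illustration. Mixing the experiment name into the key is one way to keep a user's bucket in one experiment from determining their bucket in another.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, num_variants: int = 2) -> int:
    """Map a user to a variant index in [0, num_variants) without storing anything.

    Hypothetical helper: the key format and parameter names are illustrative.
    """
    # Salt the user ID with the experiment name so that different
    # experiments produce different, independent-looking assignments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.md5(key).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:8], "big")
    # Reduce to a variant index; with a 64-bit value the modulo bias is negligible.
    return bucket % num_variants


if __name__ == "__main__":
    for uid in ("1001", "1002", "1003"):
        print(uid, assign_variant(uid, "new-checkout-button"))
```

Because the assignment depends only on the user ID and the experiment name, calling the function again for the same user returns the same variant, so no association has to be stored. MD5 is used here only for its distribution properties, not for security.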

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow