Question

I was inspired by this unique id code to generate a random 64-bit identifier.

My question: will this be good enough for about 10 million entries?

def self.generateId
  (0..15).collect{(rand*16).to_i.to_s(16)}.join
end

Solution

This is the classic birthday problem.

With m = 10^7 entries and n = 10^20 possible IDs (rounding 2^64 ≈ 1.8 × 10^19 up to the next power of ten), the collision probability is approximately:

p = 1 - exp(-m^2/(2*n))

This gives a collision probability of about 5e-07. With the exact n = 2^64 ≈ 1.8 × 10^19 it comes to about 2.7e-06, which is still tiny.
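To check the arithmetic yourself, here is a minimal sketch of the birthday approximation in Ruby. The method name `collision_probability` is made up for illustration; it just evaluates the formula above for a given m and n.

```ruby
# Birthday-bound approximation: p = 1 - exp(-m^2 / (2n))
# m = number of IDs generated, n = size of the ID space.
# collision_probability is a hypothetical helper name, not a library call.
def collision_probability(m, n)
  1.0 - Math.exp(-(m.to_f**2) / (2.0 * n))
end

m = 10**7
puts collision_probability(m, 10**20) # n rounded to 10^20
puts collision_probability(m, 2**64)  # exact size of the 64-bit space
```

Either way the probability stays in the one-in-a-million range or below for 10 million entries.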

I would say sampling without replacement, i.e. never handing out an ID that has already been issued, is your best option.
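One possible way to do that, sketched under some assumptions: keep the issued IDs in a set and redraw on the rare collision. `generate_unique_id` and the in-memory `Set` are illustrative only; with a real database you would more likely rely on a unique index and retry when the insert fails. The sketch also swaps `rand` for `SecureRandom.hex`, which is my substitution, not part of the question.

```ruby
require 'securerandom'
require 'set'

# Hypothetical helper: draw a random 64-bit hex ID and retry if it is taken.
# `issued` is a Set of IDs that have already been handed out.
def generate_unique_id(issued)
  loop do
    id = SecureRandom.hex(8)     # 8 random bytes -> 16 hex characters
    return id if issued.add?(id) # Set#add? returns nil when id is already present
  end
end

issued = Set.new
ids = Array.new(1000) { generate_unique_id(issued) }
puts ids.first
```

Given the probabilities above, the retry branch should almost never run, so the cost is just the set lookup.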

OTHER TIPS

I would make it 128 bits long; that way you certainly don't have to worry about 10M records.
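A minimal sketch of what that could look like, assuming you are fine with a 32-character hex string. `generate_id128` is a made-up name, and `SecureRandom` from the standard library is used here instead of `rand`. Plugging n = 2^128 into the birthday approximation gives a collision probability of roughly 1.5e-25 for 10 million entries.

```ruby
require 'securerandom'

# Hypothetical 128-bit variant of the question's generator.
def generate_id128
  SecureRandom.hex(16) # 16 random bytes -> 32 hex characters
end

puts generate_id128
```

`SecureRandom.uuid` is another option if the standard UUID format is preferable; a version 4 UUID carries 122 random bits.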

2^64 is about 1.8 × 10^19, which is larger than 10^7 (10 million) by a factor of about 10^12. So it is nearly completely safe to use only 64 bits.

Licensed under: CC-BY-SA with attribution