Question

I need to generate globally unique IDs by hashing some data.

On the one hand, I could use a combination of timestamp and network address, which is unique because each computer can create only one ID at any given time. But since this data is too long, I'd need to hash it, and thus collisions could occur. (As a side note, we could also throw in a random number if the timestamp is not precise enough.)

On the other hand, I could just use a random number and hash that. Shouldn't that give exactly the same hash collision probability as the first approach? This is attractive because it would be faster and much easier to implement.

Is there a difference in terms of hash collisions when using unique data rather than random data? (By the way, I will not use real GUIDs as described by the standard; mine will be only 64 bits long. But that shouldn't affect the question.)

Solution

Why bother hashing a random number? A hash function is designed to spread its inputs uniformly over the output space, but a PRNG already gives you uniformly distributed output. Hashing that output does not lower the collision probability; all you're doing is creating more work.
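
To make that concrete, here is a minimal Python sketch of the "just use a random number" approach, with a rough collision estimate. The names `new_id` and `collision_probability` are made up for illustration, and using the `secrets` module (the OS CSPRNG) instead of a plain PRNG is my own choice, so that separate machines don't need coordinated seeding. The estimate is the usual birthday-bound approximation, and it assumes the IDs are uniformly distributed over 64 bits. If your hash behaves like a uniform random function, the same estimate should also apply to hashing timestamp plus address down to 64 bits.

```python
import math
import secrets

ID_BITS = 64  # ID width taken from the question


def new_id() -> int:
    """Return a 64-bit ID drawn directly from the OS CSPRNG, with no hashing step."""
    return secrets.randbits(ID_BITS)


def collision_probability(n_ids: int, bits: int = ID_BITS) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n * (n - 1) / 2**(bits + 1)).

    Assumes every ID is uniform and independent over 2**bits values.
    """
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-n_ids * (n_ids - 1) / 2 ** (bits + 1))


if __name__ == "__main__":
    print(f"example id: {new_id():016x}")
    for n in (10**6, 10**9, 2**32):
        print(f"{n:>12} ids -> p(collision) ~= {collision_probability(n):.3g}")
```

Under these assumptions the chance of at least one collision reaches roughly 39% at about 2^32 (around 4 billion) IDs, regardless of whether the 64 bits came from a hash of unique data or straight from the random source.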

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow