Question

I wonder if it is 'safe' to hash a bunch of MD5 hash values together to create a new hash or whether this will in any way increase the probability of collisions.

The background: I have a couple of files with dependencies. Each file has an associated hash value which is calculated based on its content. Let's call this the 'single-file' hash value. In addition to this, the file should also have a hash value which includes all the dependent files, the 'multi-file' hash value.

So the question is: can I just take all the single-file MD5 hash values of the dependent files, concatenate them, and then calculate an MD5 over the concatenated values to get the multi-file hash value? Or will this result in an MD5 hash that is more likely to collide than if I concatenated the content of all dependent files and hashed that?

Alternatively, could I xor the single-file hash values together to generate a multi-file hash value, or would this likely result in more collisions?


Solution

Sounds like you need a Merkle tree.
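A minimal sketch of that idea in Python, assuming the dependencies form an acyclic graph. The names `single_file_hash`, `multi_file_hash`, and the `contents`/`deps` dictionaries are hypothetical stand-ins, not something from the question:

```python
import hashlib

def single_file_hash(content: bytes) -> bytes:
    """MD5 digest of one file's content (the 'single-file' hash)."""
    return hashlib.md5(content).digest()

def multi_file_hash(name, contents, deps, _cache=None):
    """MD5 over a file's own digest followed by the multi-file hashes of its dependencies.

    contents: dict of file name -> bytes (hypothetical in-memory stand-in for real files)
    deps:     dict of file name -> list of dependency names (assumed to be acyclic)
    """
    if _cache is None:
        _cache = {}
    if name in _cache:
        return _cache[name]
    h = hashlib.md5()
    h.update(single_file_hash(contents[name]))
    # Sort the dependencies so the result does not depend on listing order.
    for dep in sorted(deps.get(name, [])):
        h.update(multi_file_hash(dep, contents, deps, _cache))
    _cache[name] = h.digest()
    return _cache[name]

# Hypothetical example: a.c depends on a.h.
contents = {"a.c": b"int main(){}", "a.h": b"#define X 1"}
deps = {"a.c": ["a.h"]}
print(multi_file_hash("a.c", contents, deps).hex())
```

With this layout, a change to any dependency changes the multi-file hash of every file that depends on it, directly or indirectly, while each file's content is still only hashed once.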

Other tips

MD5 has known collision weaknesses; see the MD5 entry on Wikipedia.

However, if you use MD5 not for security but as a unique marker to check dependencies, even hashing concatenated hashes should be pretty safe.

Or, if it's not too late, switch to SHA-1.

I think the risk of a collision is about the same for hashing the concatenated files as for hashing the concatenated file hashes.
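To put the concatenate-then-hash approach next to the XOR alternative raised in the question, here is a small sketch. The helper names `md5`, `combine_concat`, and `combine_xor` are made up for illustration. It relies only on the fact that XOR is commutative and self-inverse, so an XOR-combined value ignores ordering and cancels out any digest that appears twice:

```python
import hashlib
from functools import reduce

def md5(data: bytes) -> bytes:
    """Raw 16-byte MD5 digest."""
    return hashlib.md5(data).digest()

def combine_concat(digests):
    """MD5 over the concatenated single-file digests (order-sensitive)."""
    return md5(b"".join(digests))

def combine_xor(digests):
    """Byte-wise XOR of 16-byte digests, starting from all zero bytes."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), digests, bytes(16))

a, b = md5(b"file a"), md5(b"file b")

# XOR cannot tell these inputs apart:
print(combine_xor([a, b]) == combine_xor([b, a]))   # same files, different order
print(combine_xor([a, b, b]) == combine_xor([a]))   # a duplicated digest cancels itself

# Hashing the concatenation distinguishes the two orderings:
print(combine_concat([a, b]) == combine_concat([b, a]))
```

If ordering should not matter for the multi-file hash, sorting the digests before concatenating them gives an order-independent result without the cancellation behaviour of XOR.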

Licensed under: CC-BY-SA with attribution