Question

This is a theoretical question, but I am curious about it. What if I do this (the code is in PHP, but the language doesn't really matter in this case):

$value = '';   // starting value
$seen  = [];   // stands in for the database table, one entry per value

while (true) {
    $value = md5($value);

    // Check for a repeated hash value and stop if there is one
    if (isset($seen[$value])) {
        break;
    }

    // Save the value (in the original idea: one database row per value)
    $seen[$value] = true;
}

As you can see, I suspect that there will be repeated hash values. I don't think every possible text can have its own hash value, because that would mean every hash value belongs to exactly one text, and that doesn't make sense.

My questions are: Is there any article about this "problem" out there? Can I get the same value for two different inputs in one system, for example when I hash files to check whether they are valid? Can this cause problems in any system?


Solution

If you care about multiple texts hashing to the same value, don't use MD5. MD5 has fast collision attacks, which violate the property you want. Use SHA-2 instead.
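For the file-validation case from the question, a minimal sketch using SHA-256 (a member of the SHA-2 family) might look like the following. The function name `fileMatchesDigest` and its parameters are hypothetical, chosen only for illustration; `hash_file`, `hash` and `hash_equals` are standard PHP functions.

```php
<?php
// Sketch: check a file against an expected SHA-256 digest.
// fileMatchesDigest(), $path and $expectedHex are illustrative names.
function fileMatchesDigest(string $path, string $expectedHex): bool
{
    // hash_file() returns the digest as lowercase hex, or false on failure
    $actual = hash_file('sha256', $path);
    if ($actual === false) {
        return false;
    }
    // hash_equals() compares the two strings in constant time
    return hash_equals(strtolower($expectedHex), $actual);
}

// Usage: write a temporary file and check it against a known digest.
$tmp = tempnam(sys_get_temp_dir(), 'sha');
file_put_contents($tmp, 'hello');
var_dump(fileMatchesDigest($tmp, hash('sha256', 'hello')));
unlink($tmp);
```

Switching the algorithm name is the only change compared with an MD5-based check, so there is little reason to keep MD5 where collisions matter.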

When using a secure hash function, collisions for 128-bit hashes are extremely difficult to find, and by that I mean that I know of no case where it has happened. But if you want to avoid even that chance, simply use 256-bit hashes. Then finding a collision by brute force is beyond the computational power of all humanity for now. In particular, there is no known message pair for which SHA-256(m1) == SHA-256(m2) with m1 != m2.
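To get a feeling for these numbers, the standard birthday approximation p ≈ 1 - exp(-k^2 / 2^(n+1)) estimates the chance of at least one collision among k random n-bit hashes. The sketch below assumes an ideal hash function with uniformly random outputs; `collisionProbability` is a hypothetical helper name.

```php
<?php
// Approximate probability of at least one collision among $k random
// $bits-bit hash values, assuming an ideal (uniformly random) hash:
//   p ≈ 1 - exp(-k^2 / 2^(bits+1))
function collisionProbability(float $k, int $bits): float
{
    $exponent = -($k * $k) / pow(2.0, $bits + 1);
    // -expm1(x) equals 1 - exp(x) and stays accurate for tiny probabilities
    return -expm1($exponent);
}

// A trillion hashes at 128 and at 256 bits
printf("%.3e\n", collisionProbability(1e12, 128));
printf("%.3e\n", collisionProbability(1e12, 256));

// Around 2^64 hashes at 128 bits the probability is no longer negligible
printf("%.3f\n", collisionProbability(pow(2, 64), 128));
```

With a trillion stored values the estimate for 128 bits is still vanishingly small, and for 256 bits it is smaller by dozens of orders of magnitude.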

You're right that hashes can't be unique (see the pigeonhole principle), but the chances of you actually finding such a case are extremely low. So don't bother handling that case.
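The pigeonhole argument is easy to see on a deliberately tiny hash. The sketch below truncates MD5 to 8 bits, so only 256 outputs exist and any 257 distinct inputs must contain a collision. The truncation, the `tinyHash` and `firstTinyCollision` names, and the `message-N` inputs are assumptions made purely for illustration.

```php
<?php
// Truncate MD5 to its first byte (8 bits): only 256 possible outputs.
// Illustrative only; never truncate a hash this far in practice.
function tinyHash(string $input): string
{
    return substr(md5($input), 0, 2); // 2 hex characters = 8 bits
}

// Returns [firstInput, secondInput, sharedHash] for the first collision.
function firstTinyCollision(): array
{
    $seen = [];
    for ($i = 0; $i <= 256; $i++) {   // 257 distinct inputs
        $input = "message-$i";
        $h = tinyHash($input);
        if (isset($seen[$h])) {
            return [$seen[$h], $input, $h];
        }
        $seen[$h] = $input;
    }
    // 257 inputs cannot fit into 256 outputs without a repeat
    throw new LogicException('unreachable by the pigeonhole principle');
}

[$a, $b, $h] = firstTinyCollision();
echo "$a and $b both hash to $h\n";
```

The same argument applies to full-size hashes; the output space is just so large that you never get close to filling it.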

I typically aim for a 128-bit security level. Since the collision resistance of a hash is only about half its output length, I use a 256-bit hash function such as SHA-256 when I need a collision-resistant hash.


With your hash chain you won't find a collision unless you're willing to wait for a very long time. Collisions become likely once you have around 2^(n/2) hashes, which is 2^64 in the case of 128-bit hashes such as MD5. I know of no brute-force collisions against a 128-bit hash. The only collisions I know of come from carefully crafted messages that exploit weaknesses in the hashing scheme (such attacks exist against MD5).
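The 2^(n/2) estimate can be tried out on a shortened hash, where the chain from the question does repeat quickly. The sketch below keeps only the first 6 hex characters of each MD5 value (24 bits), so a repeat is expected after a number of steps on the order of 2^12. The truncation and the `chainStepsUntilRepeat` name are assumptions for illustration, not part of the original experiment.

```php
<?php
// Iterate a truncated MD5 chain until a value repeats.
// $hexChars sets the hash size: 6 hex characters = 24 bits (illustrative).
function chainStepsUntilRepeat(string $start, int $hexChars): int
{
    $seen  = [];   // in-memory stand-in for the database table
    $value = $start;
    for ($steps = 1; ; $steps++) {
        $value = substr(md5($value), 0, $hexChars);
        if (isset($seen[$value])) {
            return $steps;   // this step produced an already-seen value
        }
        $seen[$value] = true;
    }
}

$steps = chainStepsUntilRepeat('', 6);
echo "24-bit chain repeated after $steps steps (2^12 = 4096 for comparison)\n";
```

Each extra hex character multiplies the expected number of steps by about 4, which is why the full 32-character chain is out of reach.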

Other tips

Hash the value multiple times, with the same method or with different methods; then it is nearly impossible for a value to repeat. Also check whether the values repeat, and if they do, repeat the hash function until the values are different. Then save the result in the database or use it wherever you like.

Licensed under: CC-BY-SA with attribution