Do similar passwords have similar hashes?

https://stackoverflow.com/questions/2683626

30-09-2019

Question

Our computer system at work requires users to change their password every few weeks, and you cannot have the same password as you had previously. It remembers something like 20 of your last passwords. I discovered most people simply increment a digit at the end of their password, so "thisismypassword1" becomes "thisismypassword2" then 3, 4, 5 etc.

Since all of these passwords are stored somewhere, I wondered if there was any weakness in the hashes themselves, for standard hashing algorithms used to store passwords like MD5. Could a hacker increase their chances of brute-forcing the password if they have a list of hashes of similar passwords?

Solution

Do similar passwords have similar hashes?

No.

Any similarity, even a complex correlation, would be considered a weakness in the hash. Once discovered by the crypto community it would be published, and enough discovered weaknesses in the hash eventually add up to advice not to use that hash any more.

Of course there's no way to know whether a hash has undiscovered weaknesses, or weaknesses known to an attacker but not published, in which case most likely the attacker is a well-funded government organization. The NSA certainly is in possession of non-public theoretical attacks on some crypto components, and GCHQ probably is too, but whether those attacks are usable is another matter. I'd guess that a few other countries have secret crypto programs with enough mathematicians to have done original work: China would be my first guess. All you can do is act on the best available information. And if the best available information says that a hash is "good for crypto", then one of the things that means is that it has no usable similarities of this kind.

Finally, some systems use weak hashes for passwords, either due to ignorance by the implementer or for legacy reasons. All bets are off for the properties of a hashing scheme that either hasn't had public review, or else has been reviewed and found wanting, or else is old enough that significant weaknesses have eventually been found. MD5 is broken for some purposes (since there exist practical means to generate collisions) but not for all purposes. AFAIK it's OK for this, in the sense that there is no practical pre-image attack, and having a handful of hashes of related plaintexts is no better than having a handful of hashes of unrelated plaintexts. But for unrelated reasons you shouldn't really use a single application of any hash for password storage anyway; you should use a salted scheme that applies many rounds (key stretching), such as PBKDF2 or bcrypt.
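The "multiple rounds" advice can be sketched with Python's standard library. This is only an illustration, not the system described in the question: the choice of PBKDF2-HMAC-SHA256, the 16-byte salt, the iteration count, and the function names hash_password and verify_password are all assumptions made for the example.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed iteration count for illustration; tune it for your own hardware.
ITERATIONS = 100_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Each guess now costs the attacker ITERATIONS hash computations instead of one, and the per-password salt means the same password stored twice yields two different keys.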

Could a hacker increase their chances of brute-forcing the password if they have a list of hashes of similar passwords?

Indirectly, yes, knowing that those are your old passwords. Not because of any property of the hash, but suppose the attacker manages to (very slowly) brute-force one or more of your old passwords using those old hashes, and sees that in the past it has been "thisismypassword3" and "thisismypassword4".

Your password has since changed, to "thisismypassword5". Well done, by changing it before the attacker cracked it, you have successfully ensured that the attacker did not recover a valuable password! Victory! Except it does you no good, since the attacker has the means to guess the new one quickly anyway using the old password(s).

Even if the attacker only has one old password, and therefore cannot easily spot a trend, password crackers work by trying passwords which are similar to dictionary words and other values. To over-simplify a bit, it will try the dictionary words first, then strings consisting of a word with one extra character added, removed or changed, then strings with two changes, and so on.

By including your old password in the "other values", the attacker can ensure that strings very similar to it are checked early in the cracking process. So if your new password is similar to old ones, then having the old hashes does have some value to the attacker - reversing any one of them gives him a good seed to crack your current password.
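To make the "seed" idea concrete, here is a rough sketch of how a cracker might order its guesses around a recovered old password. It assumes unsalted MD5 purely for brevity, and the helper names near_variants and crack_from_seed are made up for this example; real cracking tools use far richer mutation rules.

```python
import hashlib
import string


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def near_variants(seed: str):
    """Yield guesses close to `seed`: bumped trailing number first, then one-character edits."""
    # 1. Increment or decrement a trailing number (the habit described in the question).
    stem = seed.rstrip(string.digits)
    digits = seed[len(stem):]
    if digits:
        n = int(digits)
        for delta in (1, 2, 3, -1):
            if n + delta >= 0:
                yield f"{stem}{n + delta}"
    alphabet = string.ascii_lowercase + string.digits
    # 2. Append one character.
    for ch in alphabet:
        yield seed + ch
    # 3. Replace one character.
    for i in range(len(seed)):
        for ch in alphabet:
            if ch != seed[i]:
                yield seed[:i] + ch + seed[i + 1:]


def crack_from_seed(target_hash: str, seed: str):
    """Return (guess, attempts) if a near variant matches, else (None, attempts)."""
    attempts = 0
    for guess in near_variants(seed):
        attempts += 1
        if md5_hex(guess) == target_hash:
            return guess, attempts
    return None, attempts


if __name__ == "__main__":
    current = md5_hex("thisismypassword5")
    # Finds the current password on the very first guess.
    print(crack_from_seed(current, "thisismypassword4"))
```

With "thisismypassword4" as the seed, the first candidate generated is "thisismypassword5", so the current hash falls on attempt 1. The hash itself leaked nothing; the user's habit did.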

So, incrementing your password regularly doesn't add much. Changing your password to something that's guessable from the old password puts your attacker in the same position as they'd be in if they knew nothing at all, but your password was guessable from nothing at all.

The main practical attacks on password systems these days are eavesdropping (via keyloggers and other malware) and phishing. Trying to reverse password hashes isn't a good percentage attack, although if an attacker has somehow got hold of an /etc/passwd file or equivalent, they will break some weak passwords that way on the average system.

OTHER TIPS

With a good hash algorithm, similar passwords are scattered across the whole output space, so they will have very different hashes.

You can try this with MD5 and different strings.

"hello world" - 5eb63bbbe01eeed093cb22bb8f5acdc3
"hello  world" - fd27fbb9872ba413320c606fdfb98db1
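If you want a number rather than an eyeball comparison, you can count how many of the 128 output bits differ between the MD5 hashes of two similar strings. This is a small sketch; the helper names md5_bytes and differing_bits are invented for the example. For a well-behaved hash you would expect roughly half the bits to differ on average.

```python
import hashlib


def md5_bytes(text: str) -> bytes:
    return hashlib.md5(text.encode("utf-8")).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Hamming distance between two equal-length digests, in bits."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    base = "thisismypassword"
    for n in range(1, 5):
        first, second = f"{base}{n}", f"{base}{n + 1}"
        bits = differing_bits(md5_bytes(first), md5_bytes(second))
        print(f"{first} vs {second}: {bits} of 128 bits differ")
```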

It depends on the hashing algorithm. If it is any good, similar passwords should not have similar hashes.

The whole point of a cryptographic hash is that similar passwords would absolutely not create similar hashes.

More importantly, you would most likely salt the password so that even the same passwords do not produce the same hash.

It depends on the hash algorithm used. A good one will distribute similar inputs to disparate outputs.

Different inputs may result in the same hash; this is called a hash collision.

Check here:

http://en.wikipedia.org/wiki/Collision_%28computer_science%29

Hash collisions may be used to increase the chances of a successful brute-force attack; see:

http://en.wikipedia.org/wiki/Birthday_attack
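A birthday search looks for any two inputs that share a hash, which is a different goal from finding an input that matches one particular stored hash. The sketch below shows the effect on a toy scale by truncating MD5 to 16 bits; the truncation, the "password" prefix, and the function names are assumptions made so that a collision turns up quickly.

```python
import hashlib


def truncated_md5(text: str, hex_chars: int = 4) -> str:
    """First `hex_chars` hex digits of MD5; 4 hex digits = 16 bits (toy size)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:hex_chars]


def find_collision(prefix: str = "password", hex_chars: int = 4):
    """Hash prefix0, prefix1, ... until two inputs share a truncated hash."""
    seen = {}
    limit = 16 ** hex_chars + 1  # pigeonhole: a repeat must occur by then
    for i in range(limit):
        candidate = f"{prefix}{i}"
        digest = truncated_md5(candidate, hex_chars)
        if digest in seen:
            return seen[digest], candidate, i + 1
        seen[digest] = candidate
    return None


if __name__ == "__main__":
    first, second, tried = find_collision()
    print(f"{first!r} and {second!r} collide after {tried} inputs")
```

With 65,536 possible truncated values, a repeat typically shows up after a few hundred inputs rather than tens of thousands. The same square-root effect applies to a full 128-bit hash, where a generic birthday search needs on the order of 2^64 inputs.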

To expand on what others have said, a quick test shows that you get vastly different hashes with small changes made to the input.

I used the following code to run a quick test:

<?php
// Print the MD5 hash of "password0" through "password4".
for ($i = 0; $i < 5; $i++) {
    echo 'password' . $i . ' - ' . md5('password' . $i) . "<br />\n";
}
?>

and I got the following results:

password0 - 305e4f55ce823e111a46a9d500bcb86c
password1 - 7c6a180b36896a0a8c02787eeafb0e4c
password2 - 6cb75f652a9b52798eb6cf2201057c73
password3 - 819b0643d6b89dc9b579fdfc9094f28e
password4 - 34cc93ece0ba9e3f6f235d4af979b16c

Short answer, no!

The output of a hash function varies greatly even if a single character is incremented.

But that only matters if the attacker is trying to break the hash function itself.

Of course, incrementing a digit is still bad practice, since it makes brute-forcing easier.

No, if you change the password even slightly it produces a completely new hash.

As a general rule, a "good hash" will not hash two similar (but unequal) strings to similar hashes. MD5 is good enough that this isn't a problem. However, there are "rainbow tables" (essentially precomputed password:hash pairs) for quite a few common passwords, and for some password hashes (the traditional DES-based Unix passwords, for example) full rainbow tables exist.
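The precomputation idea can be sketched as a plain hash-to-password dictionary. Note that this is a simple lookup table, not a true rainbow table (which stores hash chains to trade lookup time for space), and it assumes unsalted MD5; the word list and the names build_lookup and reverse are hypothetical.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup(candidates):
    """Precompute hash -> password for every candidate (unsalted MD5 assumed)."""
    return {md5_hex(p): p for p in candidates}


# Hypothetical word list; real tables cover millions of entries.
COMMON = [f"password{i}" for i in range(10)] + ["letmein", "qwerty", "hello world"]
TABLE = build_lookup(COMMON)


def reverse(hash_hex: str):
    """Return the password for a known hash, or None if it was not precomputed."""
    return TABLE.get(hash_hex)


if __name__ == "__main__":
    print(reverse("7c6a180b36896a0a8c02787eeafb0e4c"))  # prints: password1
```

A per-password salt defeats this kind of table, because the attacker would need a separate table for every salt value.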

Licensed under: CC-BY-SA with attribution
