What's the difference between bcrypt and hashing multiple times?

https://stackoverflow.com/questions/7072478

16-12-2020

Question

How is bcrypt stronger than, say,

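import hashlib

def md5lots(password, salt, rounds):
    # password and salt are bytes; each round re-hashes the previous digest with the salt
    if rounds < 1:
        return password
    newpass = hashlib.md5(password + salt).digest()
    return md5lots(newpass, salt, rounds - 1)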

I get the feeling, given its hype, that more intelligent people than me have figured out that bcrypt is better than this. Could someone explain the difference in 'smart layman' terms?

Solution

There are three significant differences between bcrypt and hashing multiple times with MD5:

The size of the output: 128 bits (16 bytes) for MD5 versus 184 bits (23 bytes) of hash for bcrypt, which is stored together with its salt and cost in a 60-character string. If you store millions of hashes in a database, this has to be taken into account.
Collision attacks against MD5 are practical, and a theoretical preimage attack is known.
Bcrypt can be configured to iterate more and more as CPUs become more powerful.

Hence, using salting-and-stretching with MD5 is not as safe as using bcrypt. This issue can be solved by selecting a better hash function than MD5.

For example, if SHA-256 is selected, the output size will be 256 bits (32 bytes). If the salting-and-stretching can be configured to increase the number of iterations, as bcrypt can, then there is no difference between the two methods except the amount of space required to store the resulting hashes.
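As a rough illustration of what "configurable salting-and-stretching" could look like, here is a minimal sketch that mirrors the loop from the question but uses SHA-256 and stores the round count next to the salt. The function names, the 16-byte salt and the default of 100,000 rounds are assumptions made for this example, not recommendations from the answer.

```python
import hashlib
import hmac
import os

def stretch_sha256(password: bytes, salt: bytes, rounds: int) -> bytes:
    """Re-hash the running value together with the salt, `rounds` times."""
    value = password
    for _ in range(rounds):
        value = hashlib.sha256(value + salt).digest()
    return value

def make_record(password: bytes, rounds: int = 100_000) -> tuple:
    """Return (rounds, salt, digest); storing rounds lets the cost be raised later."""
    salt = os.urandom(16)  # fresh random salt per password (size is an assumption)
    return rounds, salt, stretch_sha256(password, salt, rounds)

def verify(password: bytes, record: tuple) -> bool:
    rounds, salt, expected = record
    candidate = stretch_sha256(password, salt, rounds)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison
```

Because the round count travels with each record, new passwords can use a higher count while old records still verify.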

OTHER TIPS

The principal difference: MD5 and other hash functions designed to verify data are built to be fast, whereas bcrypt() is built to be slow.

When you are verifying data, you want the speed, because you want to verify the data as fast as possible.

When you are trying to protect credentials, the speed works against you. An attacker with a copy of a password hash can try many more brute-force guesses, because MD5, SHA1 and the like are cheap to execute.

bcrypt, in contrast, is deliberately expensive. This matters little to a genuine user who makes one or two attempts to authenticate, but it makes brute-forcing far more costly.

You are effectively talking about implementing PBKDF2, the Password-Based Key Derivation Function. Like bcrypt, it lets you lengthen the amount of CPU time it takes to derive a password hash. One advantage of a simple loop like yours over something like bcrypt is that, if you record how many iterations the password has been through, you can increase the count later without resetting all the passwords in the database: have your algorithm pick up the stored result as if it were at the nth iteration (where n is the previous iteration count) and keep going. Note that standard PBKDF2 feeds the password into every iteration, so this shortcut applies to a plain chained loop rather than to PBKDF2 itself.
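To make the "keep going from the nth iteration" idea concrete for a plain chained-hash loop like the one in the question, here is a small sketch. SHA-256, the fixed demo salt and the round counts are assumptions chosen for illustration only.

```python
import hashlib

def stretch(value: bytes, salt: bytes, rounds: int) -> bytes:
    """Chain hash(value + salt) `rounds` times; each round sees only the previous digest."""
    for _ in range(rounds):
        value = hashlib.sha256(value + salt).digest()
    return value

salt = b"demo-salt-not-random"  # hypothetical fixed salt so the demo is repeatable

# Hash stored in the database when the policy was 1000 rounds.
stored = stretch(b"correct horse", salt, 1000)

# Later, raise the policy to 1500 rounds without knowing the password:
upgraded = stretch(stored, salt, 500)

# A fresh login under the new policy produces the same value.
fresh = stretch(b"correct horse", salt, 1500)
print(upgraded == fresh)  # True
```

This works only because each round depends on nothing but the previous digest and the salt.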

It is recommended that you use a proper PBKDF2 library instead of creating your own because, let's face it, as with all cryptography, the only way you know if something is safe is if it has been 'tested' by the interwebs.
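For instance, Python's standard library ships PBKDF2 as `hashlib.pbkdf2_hmac`. The following is a minimal sketch of using it; the helper names, the choice of HMAC-SHA-256, the 16-byte salt and the iteration count are assumptions for the example, and the count in particular should be tuned to your own hardware.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # illustrative value only; pick a count that suits your hardware

def hash_password(password: str, iterations: int = ITERATIONS) -> tuple:
    """Return (salt, iterations, derived_key) for storage."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, key

def check_password(password: str, salt: bytes, iterations: int, key: bytes) -> bool:
    """Re-derive the key with the stored parameters and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, key)
```

Usage is a pair of calls: `salt, n, key = hash_password("s3cret")` at registration, then `check_password(attempt, salt, n, key)` at login.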

Systems that use this method:
.NET has a library already implemented.
Mac, Linux and Windows file encryption use high-iteration (10,000+) versions of this method to secure their file systems.
Wi-Fi networks are often secured using this method.

Thanks for asking the question; it forced me to research the method I was using to secure my passwords.

TTD

Although this question is already answered, I would like to point out a subtle difference between BCrypt and your hashing loop. I will ignore the deprecated MD5 algorithm and the exponential cost factor, because you could easily fix both in your scheme.

You calculate a hash value and then use the result to calculate the next hash value. If you look at the implementation of BCrypt, you can see that each iteration uses the resulting hash value as well as the original password (key).

Eksblowfish(cost, salt, key)
  state = InitState()
  state = ExpandKey(state, salt, key)
  repeat (2^cost)
    state = ExpandKey(state, 0, key)
    state = ExpandKey(state, 0, salt)
  return state

This is why you cannot take a BCrypt-hashed password and continue iterating: you would have to know the original password. I cannot prove it, but I suppose this makes BCrypt safer than a simple hashing loop.
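The structural point can be shown with a toy analogue. This is not bcrypt: HMAC-SHA-256 stands in for ExpandKey purely as an assumption for illustration, and the function names are made up. What it shares with the pseudocode above is that the key is mixed in on every round, so extending the work requires the original password.

```python
import hashlib
import hmac

def mix(key: bytes, state: bytes) -> bytes:
    """Stand-in for ExpandKey: fold `key` into `state` (HMAC is an assumption, not bcrypt)."""
    return hmac.new(key, state, hashlib.sha256).digest()

def continue_rounds(state: bytes, key: bytes, salt: bytes, rounds: int) -> bytes:
    for _ in range(rounds):
        state = mix(key, state)   # every round needs the password again
        state = mix(salt, state)
    return state

def keyed_loop(key: bytes, salt: bytes, rounds: int) -> bytes:
    state = mix(key, salt)        # initial setup from salt and key
    return continue_rounds(state, key, salt, rounds)

key, salt = b"correct horse", b"demo-salt"
partial = keyed_loop(key, salt, 5)
full = keyed_loop(key, salt, 8)

print(continue_rounds(partial, key, salt, 3) == full)       # True: works with the password
print(continue_rounds(partial, b"guess", salt, 3) == full)  # False: fails without it
```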

Strictly speaking, bcrypt actually encrypts the text:

OrpheanBeholderScryDoubt

64 times.

But it does it with a key that was derived from your password and some randomly generated salt.

Password hashing is not hashing

The real virtue of "password hashing algorithms" (like bcrypt) is that they use a lot of RAM.

SHA2 is designed to be fast. If you're a real-time web server and you want to validate file integrity, you want something that runs extraordinarily fast, with extraordinarily low resource usage. That is the antithesis of password hashing.

SHA2 is designed to be fast
SHA2 can operate with 128 bytes of RAM
SHA2 is easily implementable in hardware
I own a USB stick device that can calculate 330 million hashes per second
in fact, I own 17 of them

If you perform a "fast" hash multiple times (e.g. 10,000 iterations is a common recommendation for PBKDF2), then you're not really adding any security.

What you need is a hash that is difficult to implement in hardware. What you need is a hash that is hard to parallelize on a GPU.

Over the last few decades we've learned that RAM is the key to slowing down password hashing attempts. Custom hardware shines at performing raw computation (in fact, only about 1% of your CPU is dedicated to computation; the rest goes to turning the machine instructions into something faster: pre-fetching, out-of-order execution, branch prediction, cache). The way to stymie custom hardware is to make the algorithm touch a lot of RAM.

SHA2: 128 bytes
bcrypt: 4 KB
scrypt (configurable): 16 MB with N = 2^14, r = 8 (Litecoin's much smaller parameters use only 128 KB)
Argon2 (configurable): 64 MB in documentation examples
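For scrypt, the memory figure follows from its parameters: the main working array is roughly 128 * N * r bytes. The sketch below uses Python's `hashlib.scrypt` with N = 2^14, r = 8, p = 1, which are assumed example parameters; the explicit `maxmem` ceiling and the 32-byte output length are also choices made for this example.

```python
import hashlib
import os

def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate size of scrypt's main working array: 128 * N * r bytes."""
    return 128 * n * r

n, r, p = 2**14, 8, 1          # assumed example parameters
salt = os.urandom(16)

key = hashlib.scrypt(
    b"correct horse",
    salt=salt,
    n=n, r=r, p=p,
    maxmem=64 * 1024 * 1024,   # allow up to 64 MiB so the 16 MiB array fits
    dklen=32,
)

print(scrypt_memory_bytes(n, r) // (1024 * 1024), "MiB")  # 16 MiB
```

Doubling N doubles both the memory and the time needed per guess, which is the knob a plain hash loop does not have.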

Password hashing does not mean simply using a fast hash multiple times.

A modern recommended bcrypt cost factor is 12, so that it takes about 250 ms to compute.
You would have to perform about 330,000 iterations of SHA2 to equal that time cost on a modern single-core CPU.

But then we get back to my 2.5 W USB SHA2 stick and its 330 Mhashes/sec. To defend against that, it would have to be about 83M iterations.
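The arithmetic behind those figures, spelled out using only the numbers quoted above (250 ms target, 330,000 CPU iterations, 330 Mhashes/sec for the stick); the variable names are just for this example.

```python
TARGET_SECONDS = 0.250                   # bcrypt at cost 12, per the figure above
CPU_ITERATIONS = 330_000                 # SHA2 iterations that take 250 ms on one core
STICK_HASHES_PER_SECOND = 330_000_000    # the USB hashing stick

# Hash rate implied for the single CPU core.
cpu_rate = CPU_ITERATIONS / TARGET_SECONDS

# Iterations needed so that one guess costs the stick 250 ms as well.
stick_iterations = STICK_HASHES_PER_SECOND * TARGET_SECONDS

# How much faster the stick is than the CPU core.
speedup = STICK_HASHES_PER_SECOND / cpu_rate

print(cpu_rate)          # 1320000.0 hashes/sec
print(stick_iterations)  # 82500000.0, i.e. about 83M iterations
print(speedup)           # 250.0
```

So an iteration count tuned to cost a defender's CPU 250 ms costs the stick only about 1 ms per guess.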

If you try to add only CPU cost, you're losing.
You have to add memory cost.

bcrypt is 21 years old, and it only uses 4KB. But it is still ~infinitely better than any amount of MD5, SHA-1, or SHA2 hashing.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow