Question

I need to crack a sha256 hash, and I know the answer is in coordinates, but I don't know what are the coordinate values example:

3f1c756daec9ebced7ff403acb10430659c13b328c676c4510773dc315784e4e
58.375782 26.742632

Is it possible to create a python script that makes two variables (both with the value 00.000000), then add them togheter (ex: k=i+" "+j), then converts k into sha256 and compares it to the sha256, I'm trying to crack. If it doesn't equal the sha256 being cracked, then it adds i a value (i=i+00.000001) and triess again. and so on and so on

Was it helpful?

Solution

Producing all possible coordinates between 00.000000 and 99.999999 is easy enough:

from itertools import product
import hashlib

digits = '0123456789'

for combo in product(digits, repeat=16):
    coords = '{}.{} {}.{}'.format(
        ''.join(combo[:2]), ''.join(combo[2:8]),
        ''.join(combo[8:10]), ''.join(combo[10:]))
    hash = hashlib.sha256(coords).hexdigest()
    if hash == '3f1c756daec9ebced7ff403acb10430659c13b328c676c4510773dc315784e4e':
        print coords
        break

This'll brute-force all 10**16 (a big number) combinations. Sit back and relax, this'll take a while.

And by 'a while', we really mean not in your lifetime, or anyone else's. Just iterating over all possible combinations produced by product() takes a huge amount of time, as each added digit to try increases the time required by a factor of 10:

>>> from collections import deque
>>> from itertools import product
>>> from timeit import timeit
>>> digits = '0123456789'
>>> timeit(lambda: deque(product(digits, repeat=8), 0), number=5)
3.014396679995116
>>> timeit(lambda: deque(product(digits, repeat=9), 0), number=5)
30.99540744899423

If producing all possible combinations of 8 digits takes .8 seconds (4s divided by 5 repetitions), 9 digits takes 8 seconds, you can extrapolate from that that 10 digits takes almost 1.5 minutes, etc. Just producing all possible combinations of 16 digits takes 1 million (10 ** 6) times as much time as 10 digits, so 963 days or just onder 3 years to run those in a loop. You could split this task up across 2000 different processes on a large number of computers with enough cores in total to run those processes in parallel, to reduce that to under 12 hours.

Then the loop body itself takes about 2.4 seconds per million iterations:

>>> from random import choice
>>> combo = tuple(choice(digits) for _ in range(16))  # random combination for testing
>>> timeit("""\
... coords = '{}.{} {}.{}'.format(
...     ''.join(combo[:2]), ''.join(combo[2:8]),
...     ''.join(combo[8:10]), ''.join(combo[10:]))
... hash = hashlib.sha256(coords).hexdigest()
... if hash == '3f1c756daec9ebced7ff403acb10430659c13b328c676c4510773dc315784e4e': pass
... """, 'from __main__ import combo; import hashlib')
2.3429908752441406

But you have 10 ** 10 (10 thousand million) more times work than that, totaling roughly 743 years of computation work. Even being able to run 20 thousand parallel processes won't reduce that to a reasonable number (that's still about 13.5 years of work).

Python is just not fast enough for this task. Using GPUs it should be possible to reach 500 million hashes per second (0.5 Gigahash / s), at which point you could run the above brute-force operation and find a solution in about 230 days on such a system. At a cost, of course, because such a rig would cost about $3000-$4000 a month to run! But with enough dedicated hardware you can certainly 'crack' the hash in 'humane' timeline.

OTHER TIPS

One of the common claims about hashes is that they discard information, therefore they cannot be reversed. There's infinite messages that have the same hash. You can't know which of the infinite messages that give the same hash is correct.

Of course in practice, a brute force attack often works - either because your search strategy is likely to find the real original message first (most messages with hash collisions are obviously wrong in some trivial way - e.g. the wrong format - and won't turn up in the search because of that) or because your attack needs a different message with the same hash anyway.

In your case, what you know about the message means there's less information in the message than (apparently) in the hash. Of course hashing doesn't create new information, so that means many hashes cannot occur for any co-ordinate strings. You have (with very high probability for good hash algorithms) a 1:1 relationship between possible hashes and possible messages. In principle, you have an encrypted form of your message which can be decrypted.

Many people would call me an idiot for saying this, of course. After all, you still have to find all the hashes for all the possible messages. That may be faster than some people think, but it's a long way from trivial.

It has already been pointed out that there are 10^16 possible combinations based on your co-ordinate format. One thing to check is whether all values for all those digits are possible (and equally probable). Using floating point arithmetic internally shouldn't be a problem - double precision floats aren't 8-digit decimals, but 53 bits of mantissa should be plenty to ensure all those decimal digits are fully in use. However, it may be worth checking that there's no other limitation that reduces the number of cases to check - the obvious one being the precision in how those co-ordinates are measured.

Even if certain digit values are less probable than others, that means ordering the search to check more likely values first will save a lot of time for crackers.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top