Given the input:
original md5 is ['0430f244a18146a0815aa1dd4012db46', '0430f244a18146a0815aa1dd40 12db46', '59739CCDA2F15D5AC16DB6695CAE3378']
downloader file md5 is 59739ccda2f15d5ac16db6695cae3378
You have two problems.
First of all, that first one isn't just an MD5, but an MD5 and two other things.
To fix that: If you know that origmd5
will always be in this format, just use origmd5[2]
instead of origmd5
. If you have no idea what origmd5
is, except that one of the things in it is the actual MD5, you'll have to compare against all of the elements.
Second, the actual MD5 values are both hex strings representing the same binary data, but they're different hex strings (because one is in uppercase, the other in lowercase). You could fix this by just doing a case-insensitive comparison, but it's probably more robust to unhexlify them both and compare the binary values.
In fact, if you've copied and pasted the output correctly, at least one of those hex strings has a space in the middle of it, so you actually need to unhexlify hex strings with optional spaces between hex pairs. AFAIK, there is no stdlib function that does this, but you can write it yourself in one step:
def unhexlify(s):
return binascii.unhexlify(s.replace(' ', ''))
Meanwhile, I'm not sure why you're trying to use difflib.SequenceMatcher
at all. Two slightly different MD5 hashes refer to completely different original sources; that's kind of the whole point of MD5, and crypto hash functions in general. There's no such thing as a 95% match; there's either a match, or a non-match.
So, if you know the 3rd value in origmd5
is the one you want, just do this:
s = unhexlify(origmd5[2]) == unhexlify(dlmd5)
Otherwise, do this:
s = any(unhexlify(origthingy) == unhexlify(dlmd5) for origthingy in origmd5)
Or, turning it around to make it simpler:
s = unhexlify(dlmd5) in map(unhexlify, origthingy)
Or whatever equivalent you find most readable.