Question

I need to know whether there is any way to get a unique hash from GIF images. I tried PHP's SHA-1 file function,

sha1_file

but I don't know whether two different GIF images can ever result in the same hash. Can that happen with SHA-1? If so, would SHA-2 or MD5 be better? Or some other algorithm already implemented in PHP?

I know it also depends on file size, but the GIF images never exceed 10 MB.

I need recommendations for this problem. Best regards.


Solution

There is no hash function that creates different values for each and every set of images you provide. This should be obvious as your hash values are much shorter than the files themselves and therefore they are bound to drop some information on the way. Given a fixed set of images it is rather simple to produce a perfect hash function (e.g. by numbering them), but this is probably not the answer you are looking for.

On the other hand, you can use "perfect hashing", a two-step hashing algorithm that guarantees amortized O(1) access, but since you are asking for a unique 'hash', that may not be what you are looking for either. Could you be a bit more specific about why you insist on the hash value being unique, and under what circumstances?

OTHER TIPS

sha1_file is fine.

In theory you can run into two files that hash to the same value, but in practice it is so stupendously unlikely that you should not worry about it.
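As a rough illustration of the advice above, here is a minimal PHP sketch. The function names `gif_fingerprint` and `is_same_image_file` are hypothetical, and the default of SHA-256 via `hash_file` is an assumption rather than something the answers prescribe; `sha1_file($path)` would slot in the same way. The second helper shows one way to rule out even a theoretical collision: when two digests match, compare the actual bytes.

```php
<?php
// Hypothetical helper (not from the original answers): returns a hex digest
// that identifies a file's exact bytes.
function gif_fingerprint(string $path, string $algo = 'sha256'): string
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Cannot read file: $path");
    }
    $digest = hash_file($algo, $path);
    if ($digest === false) {
        throw new RuntimeException("Hashing failed for: $path");
    }
    return $digest;
}

// Hypothetical helper: treat two files as identical only if their digests
// match AND their bytes match, so a hash collision can never merge two
// different images.
function is_same_image_file(string $a, string $b): bool
{
    if (gif_fingerprint($a) !== gif_fingerprint($b)) {
        return false;
    }
    // Files are at most ~10 MB per the question, so reading both is cheap.
    return file_get_contents($a) === file_get_contents($b);
}
```

Note that a digest like this identifies the file's bytes, not what the picture looks like: two GIFs that render identically but were encoded differently will produce different hashes.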

Hash functions don't provide any guarantees about uniqueness. Patru explains why, very well - this is the pigeonhole principle, if you'd like to read up.

I'd like to talk about another aspect, though. While you won't get any theoretical guarantees, you get a practical guarantee. Consider this: SHA-256 generates hashes that are 256 bits long. That means there are 2^256 possible hashes it can generate. Assume further that the hashes it generates are distributed almost purely randomly (true for SHA-256). That means that if you generate a billion hashes a second, 24 hours a day, you'll have generated 31,536,000,000,000,000 hashes a year. A lot, right?

Divide 2^256 by that and you get ~10^60. If you walked linearly through all possible hashes, that's how many years it would take you to generate every one of them (pack a lunch). Divide that by two, that's... still ~10^60. That's how many years you'd have to work to have a greater than 50% chance of hitting one particular hash.

A collision between any two of the hashes you have generated comes sooner, because of the birthday paradox: you reach a 50% chance after about 2^128 hashes, which at that rate is still ~10^22 years.

To put it another way, if you generate a billion hashes a second for a century, you'd have roughly a 1/10^40 chance of generating the same hash twice. Until the sun burns out (about 5 billion years), roughly 1/10^25.
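For anyone who wants to rerun this kind of arithmetic, here is a small PHP sketch. It assumes the standard birthday approximation, where the chance that at least two of n uniformly random b-bit hashes coincide is about 1 - exp(-n^2 / 2^(b+1)). The function name `collision_probability` and the variable names are hypothetical, and the result is only as good as the assumption that the hashes behave like random values.

```php
<?php
// Hypothetical helper: approximate probability that at least two of $count
// uniformly random $bits-bit hashes are equal (birthday approximation).
function collision_probability(float $count, int $bits): float
{
    $space = pow(2.0, $bits);                 // number of possible hash values
    $x = ($count * $count) / (2.0 * $space);
    return -expm1(-$x);                       // 1 - exp(-x), accurate for tiny x
}

// A billion hashes per second for one year.
$perYear = 1e9 * 60 * 60 * 24 * 365;

// SHA-256, a billion hashes a second for a century.
$century = collision_probability($perYear * 100, 256);

// SHA-1 (160 bits) over a collection of one million GIFs.
$millionGifsSha1 = collision_probability(1e6, 160);

printf("SHA-256, century at 1e9/s: %.1e\n", $century);
printf("SHA-1, one million files:  %.1e\n", $millionGifsSha1);
```

Under these assumptions the first figure comes out on the order of 10^-41 and the second on the order of 10^-37, so for a collection of GIFs, accidental collisions are not a realistic concern with either algorithm.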

Those are damn fine chances.

Licensed under: CC-BY-SA with attribution