Question

How can I calculate a file hash for files larger than 2 GB in PHP?

The only PHP function known to me is:

string hash_file ( string $algo , string $filename [, bool $raw_output = false ] )

However, this function has a limitation: it returns a hash only for files smaller than 2 GB. For larger files, hash_file() raises an error.

Here are some constraints/requests:

  • should work on a 64-bit Ubuntu Linux server
  • compatible with PHP 5+
  • there should be no file size limit
  • should be as fast as possible

This is all the information I have now. Thank you very much.


UPDATE

I have a solution that is more practical and efficient than hashing more than 2 GB of data.

I have realized that I do not have to hash complete files that are over 2 GB. To uniquely identify a file, calculating a hash from, say, the first 10 KB of its data should be sufficient. It will also be faster than hashing more than 2 GB. In other words, the ability to hash a data string that is over 2 GB is probably not necessary at all.
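The idea in this update could be sketched as follows. The function name hash_file_prefix() and its default arguments are hypothetical, chosen only for illustration; it reads at most the first 10 KB (10240 bytes) and hashes that.

```php
<?php
// Hypothetical helper: hash only the first $bytes bytes of a file.
function hash_file_prefix($filename, $algo = 'md5', $bytes = 10240)
{
    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        return false;
    }
    $data = '';
    // fread() may return fewer bytes than requested, so loop until done.
    while (strlen($data) < $bytes && !feof($handle)) {
        $chunk = fread($handle, $bytes - strlen($data));
        if ($chunk === false || $chunk === '') {
            break;
        }
        $data .= $chunk;
    }
    fclose($handle);
    return hash($algo, $data);
}
```

Note that two files sharing the same first 10 KB produce the same value with this approach, so it identifies a file only as far as its prefix does.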

I will wait for your reactions. In a couple of days, I will close this question.

Solution

I would use exec() to run a native hashing command in the shell and return the value to the PHP script. Here's an example with md5sum, but any available algorithm can be used.

  $results = array();
  $filename = '/full/path/to/file';
  // Quote the path so spaces or shell metacharacters cannot alter the command.
  exec('md5sum ' . escapeshellarg($filename), $results);

Then parse the $results array, which holds the output of the shell command, one line per element.
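A minimal sketch of that parsing step, assuming GNU md5sum, which prints the 32-character hex digest followed by the file name. The helper name parse_md5sum_output() is hypothetical.

```php
<?php
// Hypothetical helper: extract the digest from the lines captured by exec().
function parse_md5sum_output(array $lines)
{
    if (count($lines) === 0) {
        return false;
    }
    // md5sum prints "<digest>  <filename>"; a leading backslash marks an escaped file name.
    $line = ltrim($lines[0], '\\');
    $digest = strtolower(substr($line, 0, 32));
    // Accept the value only if it looks like an MD5 hex digest.
    return preg_match('/^[0-9a-f]{32}$/', $digest) === 1 ? $digest : false;
}
```

It would be called as parse_md5sum_output($results) right after exec(), treating a return value of false as a failed command.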

In general, I like to avoid doing anything directly in PHP that requires more than 1 GB of memory, especially when running under php-fpm or as an Apache module; call it a time-reinforced prejudice. This is definitely my advice when a native application can accomplish the goal and you don't particularly need cross-platform portability (for example, running on both Linux and Windows machines).
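If shelling out is not an option, the memory concern can also be addressed inside PHP by hashing incrementally. The sketch below is an alternative to the exec() approach, not part of it; the name hash_file_chunked() and the 1 MB chunk size are assumptions. Whether it gets past the 2 GB error depends on how the PHP build handles large files, which is not verified here.

```php
<?php
// Hypothetical helper: hash a file in fixed-size chunks so memory use stays constant.
function hash_file_chunked($filename, $algo = 'md5', $chunkSize = 1048576)
{
    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        return false;
    }
    $context = hash_init($algo);
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            fclose($handle);
            return false;
        }
        // Feed each chunk into the running hash instead of holding the whole file.
        hash_update($context, $chunk);
    }
    fclose($handle);
    return hash_final($context);
}
```

Only one chunk is held in memory at a time, so the file size does not affect memory use.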

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow