Question

I'm trying to allow users to upload files through a PHP website. Since all the files are saved in a single folder on the server, it's conceivable (though admittedly with low probability) that two distinct users could upload two files that, while different, are named exactly the same. Or perhaps they're exactly the same file.

In the both cases, I'd like to use exec("openssl md5 " . $file['upload']['tmp_name']) to determine the MD5 hash of the file immediately after it is uploaded. Then I'll check the database for any identical MD5 hash and, if found, I simply won't complete the upload.

However, in the move_uploaded_file documentation, I found this comment:

Warning: If you save a md5_file hash in a database to keep record of uploaded files, which is usefull to prevent users from uploading the same file twice, be aware that after using move_uploaded_file the md5_file hash changes! And you are unable to find the corresponding hash and delete it in the database, when a file is deleted.

Is this really the case? Does the MD5 hash of a file in the tmp directory change after moving it to a permanent location? I don't understand why it would. And regardless, is there another, better way of ensuring the same file is not uploaded to the filesystem multiple times?

Was it helpful?

Solution

If you're convinced by all the reasons given here in the answers and decide not to use md5 at all (I'm still not sure whether you WANT to or MUST use hash), you can just append something unique for each user and the time of uploading to each file name. That way you'll end up with more readable file names. Something like: $filename = "$filename-$user_ip_string-$microtime";. Of course, you must have all three variables ready and formatted before that, it goes without saying.

No chance of the same file name, same IP address and same microtime occuring at the same time, right? You could easily get away with microtime only, but IP will make it even more certain. Of course, like I said, all this goes if you decide not to use hashing and go for a simpler solution.

OTHER TIPS

Shouldn't you use exec("openssl md5 " . $file['upload']['name']) name instead? I'm thinking that the temporary name differs from upload to upload.

It would seem that it indeed is the case. I have shortly been looking through the docs aswell. But why dont you share the md5 checksum before using move_uploaded_file and store that value in your database linking it directly with the new file? That was you can always check the uploaded file and whether that file already exists in your filesystem.

This does require a database, but most have access to one.

Try renaming the uploaded file to a unique id. Use this:

$dest_filename = $filename;
        if (RENAME_FILE) {
      $dest_filename = md5(uniqid(rand(), true)) . '.' . $file_ext;
         }

Let me know if it helps :)

No, in general the hash doesn't change by move_uploaded_file somehow magically.

But, if you compute the md5() including the file's path, the hash will certainly change if the file is move to a new path/folder.

In case you md5() the filename, nothing will change.

It's a good idea to rename uploaded files with a unique name.

But don't forget to locate the file to finally store the file, is outside of your document root folder of your vHost. Located there, it can't be downloaded without using a PHP-script.

Final remark: While it's very very unlikely, md5 hashed of two different files may be identical.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top