Question

I'm developing a website and I'm handling user uploads like this:

There is a unique key for each user; let's say a user has the key "aaabbbccc". Right now I'm saving this user's uploads to a directory tree like this: aaa/bbb/ccc/<timestamp>.fileextension

When I want to see what uploads a user has, I do this in the user's corresponding directory:

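// Collect the names of all non-directory entries in $dir.
$user_files = [];
if (is_dir($dir))
{
    if ($dh = opendir($dir))
    {
        while (($file = readdir($dh)) !== false)
        {
            // Join with an explicit separator in case $dir has no trailing slash.
            if (!is_dir(rtrim($dir, '/').'/'.$file))
            {
                $user_files[] = $file;
            }
        }
        closedir($dh);
    }
}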

My concern is: is it efficient to get a user's list of uploads like this, or would it be faster if I stored each upload filename in a database like this:

upload_id, upload_filename, user_owner_id

(with an index on user_owner_id) and then just ran SELECT * FROM uploads WHERE user_owner_id = 1

Main question: What would be faster?

Also, if I keep the upload filenames only in the directory structure, do I have to worry about lots of read requests hitting the disk (I heard SSDs have a short life)? I doubt this concern has any practical impact, but I would love an answer to this one just for educational purposes :)

Although I shouldn't care a lot about all this since the website is not high traffic, I'm very curious about your answers, since I don't have any idea which of these two is considered better practice :)

Solution

Some considerations (I'm no expert on SSDs and DBs, I just know how to utilise them):

First:

  1. "Faster" is subjective and YMMV: what matters is what's fastest for your situation. Benchmarking is always a good idea, as no one can tell you whether a solution will be better for you until you give it a go.
  2. DBs are great at finding and sorting large datasets quickly and efficiently.
  3. Maintainability and flexibility of code should be a consideration. If using a DB will add extra goodies, then use it.
  4. You may want to hide the uploaded files, obfuscate the filenames for security, and serve files only to logged-in users. A DB makes that much easier to implement.
  5. If you have only a few hundred files per directory and low site traffic, then a DB might be overkill. Also, if you have tens of thousands of files, having them in one directory is slower than separating them into multiple directories.
  6. Using a DB means you'll need to keep the filenames in the DB synchronized with the files on disk, whereas with just the filesystem you wouldn't need to concern yourself with this.
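To make points 4 and 6 a bit more concrete, here is a minimal sketch of the DB approach. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database purely as a stand-in for whatever DB you actually run. The table layout comes from the question; the index name and the helper functions record_upload() and list_uploads() are made up for illustration.

```php
<?php
declare(strict_types=1);

// Assumption: pdo_sqlite is installed. The in-memory database is only a
// stand-in for the site's real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Columns as described in the question; the index name is hypothetical.
$pdo->exec('CREATE TABLE uploads (
    upload_id INTEGER PRIMARY KEY AUTOINCREMENT,
    upload_filename TEXT NOT NULL,
    user_owner_id INTEGER NOT NULL
)');
$pdo->exec('CREATE INDEX idx_uploads_owner ON uploads (user_owner_id)');

/**
 * Hypothetical helper: call this right after the file has been written to
 * disk, so the table and the filesystem stay in sync.
 */
function record_upload(PDO $pdo, int $userId, string $filename): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO uploads (upload_filename, user_owner_id) VALUES (?, ?)'
    );
    $stmt->execute([$filename, $userId]);
}

/**
 * Hypothetical helper: list a user's uploads without touching the disk.
 */
function list_uploads(PDO $pdo, int $userId): array
{
    $stmt = $pdo->prepare(
        'SELECT upload_filename FROM uploads
         WHERE user_owner_id = ? ORDER BY upload_id'
    );
    $stmt->execute([$userId]);
    // FETCH_COLUMN returns a flat array of the first selected column.
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

record_upload($pdo, 1, '1700000000.jpg');
record_upload($pdo, 1, '1700000042.png');
record_upload($pdo, 2, '1700000100.pdf');

print_r(list_uploads($pdo, 1));
```

The sync burden from point 6 shows up here as the rule that every code path which writes or deletes a file must also insert or delete the matching row.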

Second:

  1. SSDs are made from various technologies, e.g. non-volatile NAND flash memory or DRAM (which requires a constant power source, hence it's volatile).
  2. With the slower flash-based SSDs you also have MLC or the longer-lasting SLC.
  3. Generally SSD lifetime is about write life (actually the erase limit of the cells), which is around 10 000 cycles (or 5 000, depending on the drive). Reads don't count against that limit, so listing a directory over and over won't meaningfully wear the drive.
  4. Once an SSD reaches the end of its life (0% health), it'll stay read-only, but the hardware controller or the software will likely fail long before the cells lose their data.
  5. If your data is important then RAID your drives. Swapping drives out before they fail will help keep performance up. It's recommended to swap them out when health is around 10-15%.
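For a feel of what an erase limit in that range means in practice, here is a rough back-of-the-envelope sketch. The formula is the usual simplification (capacity times erase cycles, divided by write amplification and daily writes) and assumes perfect wear leveling. The function name and all the input numbers are made up for illustration, not taken from any real drive.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: very rough SSD lifetime estimate in years.
 * Assumes writes are spread evenly over all cells (perfect wear leveling).
 */
function estimate_ssd_lifetime_years(
    float $capacityGb,
    int $eraseCycles,
    float $dailyWritesGb,
    float $writeAmplification = 1.0
): float {
    // Total data the host can write before the cells hit their erase limit.
    $totalWritableGb = $capacityGb * $eraseCycles / $writeAmplification;
    return $totalWritableGb / $dailyWritesGb / 365;
}

// Illustrative inputs only: a 256 GB drive, 5 000 erase cycles,
// 20 GB written per day, write amplification of 2.
$years = estimate_ssd_lifetime_years(256.0, 5000, 20.0, 2.0);
printf("Estimated lifetime: %.1f years\n", $years);
```

Only writes appear in the formula, which is why read-heavy work like listing upload directories doesn't move the estimate.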

You'll need to weigh your options. With a small number of files and low traffic, a DB can be slower or just add complexity.
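If you want numbers rather than opinions, a quick benchmark along these lines might help. It is only a sketch: it assumes PHP 7.3+ (for hrtime()) and the pdo_sqlite extension, uses an in-memory SQLite database and a temp directory as stand-ins for your real setup, and the figure of 500 files per user is invented. The OS file cache and the in-memory DB both flatter the results, so treat the output as a rough hint at best.

```php
<?php
declare(strict_types=1);

// Made-up workload: adjust to match the real site.
$fileCount = 500;
$runs = 50;

// Throwaway directory of empty files named like the uploads.
$dir = sys_get_temp_dir() . '/upload_bench_' . uniqid();
mkdir($dir);

// Assumption: pdo_sqlite is available; in-memory DB as a stand-in.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE uploads (
    upload_id INTEGER PRIMARY KEY AUTOINCREMENT,
    upload_filename TEXT NOT NULL,
    user_owner_id INTEGER NOT NULL
)');
$pdo->exec('CREATE INDEX idx_uploads_owner ON uploads (user_owner_id)');

// Create the same set of names on disk and in the table.
$insert = $pdo->prepare(
    'INSERT INTO uploads (upload_filename, user_owner_id) VALUES (?, 1)'
);
$pdo->beginTransaction();
for ($i = 0; $i < $fileCount; $i++) {
    $name = (1700000000 + $i) . '.jpg';
    touch($dir . '/' . $name);
    $insert->execute([$name]);
}
$pdo->commit();

/** Average wall-clock time of $fn in milliseconds over $runs calls. */
function time_it(callable $fn, int $runs): float
{
    $start = hrtime(true); // nanoseconds
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / $runs / 1e6;
}

$fromDisk = function () use ($dir): array {
    // scandir() also returns "." and "..", so drop them.
    return array_values(array_diff(scandir($dir), ['.', '..']));
};

$select = $pdo->prepare(
    'SELECT upload_filename FROM uploads WHERE user_owner_id = ?'
);
$fromDb = function () use ($select): array {
    $select->execute([1]);
    return $select->fetchAll(PDO::FETCH_COLUMN);
};

// Keep one result of each so the two listings can be compared.
$diskList = $fromDisk();
$dbList = $fromDb();

printf("filesystem: %.3f ms per listing\n", time_it($fromDisk, $runs));
printf("database:   %.3f ms per listing\n", time_it($fromDb, $runs));

// Clean up the temporary files.
foreach (glob($dir . '/*') ?: [] as $path) {
    unlink($path);
}
rmdir($dir);
```

Run it a few times with file counts close to what you expect in production before drawing any conclusions.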

With tens of thousands of files, security requirements, and more traffic, a DB would be indispensable IMO.

Hopefully that helps :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow