Question

I have an application that receives several hundred strings per second, each about fifty bytes long, over the network. I want to cache these to an SSD for further processing. Is it safe for the SSD if I perform several hundred ~50-byte file-append writes per second? I suspect the OS might aggregate these writes, but I don't know.

I do vaguely understand that SSDs are made of cells which have to be updated on an all-or-none basis, and that each cell can withstand only so many writes (hundreds or thousands, I think, for commodity drives). Does my many-small-writes application put my SSD on the road to destruction?

Would I be safer if I cached to memory and wrote to disk in chunks every minute or hour? Obviously this is more complicated (the web service would have to read the most recent minute/hour from the memory cache and older data from disk), but I'd also prefer not to destroy too much hardware.

I've done something not too different from this for months (not years) without adverse effects, but that was years ago, and I haven't done any serious testing. I do know from experience that I can quite reliably destroy an HDD in a matter of months with this method if it's implemented naively.

Solution

SSDs themselves do write combining. They accumulate writes in the onboard cache until they have a big block, and then they write that block in parallel across many NAND dies. This is how SSDs achieve such high write speeds despite the fact that the write speed of the NAND itself is quite low. As long as the writes are more or less sequential, the size of each write is not very important (at least until the number of write requests saturates the controller's ability to keep up).

Depending on the rate at which you need to write the strings, it may make sense to gather them up into blocks at least as big as the page size of the NAND in the SSD. Nowadays the page size is usually 8 KB.
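
For instance, here is a minimal sketch in Python of that batching idea; the 8 KB page size, the `PageBuffer` name, and the cache-file path are assumptions for illustration, not specifics from the question:

```
PAGE_SIZE = 8 * 1024  # assumed NAND page size; check your drive's spec

class PageBuffer:
    """Collect incoming ~50-byte strings and append them to the cache
    file only once a full (assumed) NAND page has accumulated."""

    def __init__(self, path, page_size=PAGE_SIZE):
        self.path = path
        self.page_size = page_size
        self.chunks = []
        self.size = 0

    def append(self, record: bytes):
        self.chunks.append(record)
        self.size += len(record)
        if self.size >= self.page_size:
            self.flush()

    def flush(self):
        if self.chunks:
            with open(self.path, "ab") as f:
                f.write(b"".join(self.chunks))
            self.chunks, self.size = [], 0

buf = PageBuffer("strings.log")  # hypothetical cache-file name
# for each string received from the network:
#     buf.append(line + b"\n")
```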

OTHER TIPS

I believe that modern SSDs have enough smarts to distribute writes to less-used cells rather than continually writing to the same area of storage (wear leveling, which helps combat burnout).

That said, I would probably cache to memory until a certain threshold size is reached, and then dump everything out to the SSD. But that would be something I'd do for performance reasons...
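
A sketch of that memory-cache approach, extended with a periodic flush so nothing sits in RAM too long; the 64 KB threshold and one-minute interval are illustrative values, not recommendations:

```
import threading

class TimedBuffer:
    """Append small records in memory; flush to the file when either a
    size threshold is crossed or a timer fires."""

    def __init__(self, path, threshold=64 * 1024, interval=60.0):
        self.path = path
        self.threshold = threshold  # illustrative size threshold (64 KB)
        self.interval = interval    # illustrative interval (one minute)
        self.lock = threading.Lock()
        self.chunks, self.size = [], 0
        self._schedule()

    def _schedule(self):
        timer = threading.Timer(self.interval, self._on_timer)
        timer.daemon = True
        timer.start()

    def _on_timer(self):
        self.flush()
        self._schedule()

    def append(self, record: bytes):
        with self.lock:
            self.chunks.append(record)
            self.size += len(record)
            full = self.size >= self.threshold
        if full:
            self.flush()

    def flush(self):
        with self.lock:
            data, self.chunks, self.size = b"".join(self.chunks), [], 0
        if data:
            with open(self.path, "ab") as f:
                f.write(data)
```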

If you think about it, what's the theoretical difference between writing 50 bytes 500 times vs. 500 bytes 50 times? The same total (25,000 bytes) is ultimately written to the drive either way; it's just broken up differently. You'll still end up writing to the same number of cells on your drive, so it seems more a matter of whether the drive controller can keep up with the demand.

I advise monitoring the SMART values of your SSD when writing small pieces of data to a database in separate commits.
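
One way to watch those counters is to poll smartctl from smartmontools; the device path below is an assumption, and which attributes matter (wear leveling, total LBAs written, erase counts) varies by vendor:

```
import subprocess

def read_smart_attributes(device="/dev/sda"):  # device path is an assumption
    """Dump the SMART attribute table for a drive via `smartctl -A`.

    Requires smartmontools and usually root privileges; the attribute
    names to watch differ between vendors.
    """
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(read_smart_attributes())
```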

Personally, I started a project where I had two daemons writing packets of data into MySQL InnoDB. The packets were about 100 bytes each. One daemon wrote a packet every 15 seconds, the other a packet every 1.5 seconds. Each packet was a separate commit, and I saw a HUGE, unbelievably huge impact on my SSD.

After starting the project, the "average block erase" counter began to increase by 1 each day. As I understand it, this means the whole SSD (which is 70% empty) was being rewritten every day just to sustain two InnoDB tables of 60 MB each. I described this situation in a little more (unnecessary) detail on my blog.
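
The mitigation this story points at is to group many packets into one transaction instead of committing each one. A sketch assuming the mysql-connector-python package, with a made-up table name and placeholder connection details:

```
import mysql.connector  # assumes the mysql-connector-python package

def write_batch(conn, packets):
    """Insert many ~100-byte packets under a single commit, so InnoDB
    flushes its log once per batch instead of once per packet."""
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO packets (payload) VALUES (%s)",  # hypothetical table
        [(p,) for p in packets],
    )
    conn.commit()  # one durable log flush for the whole batch
    cur.close()

# Placeholder connection details:
# conn = mysql.connector.connect(host="localhost", user="app",
#                                password="secret", database="telemetry")
```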
