Question

I have a little program (Python 2.7) that runs on an old machine; it repeatedly captures pictures (for timelapses) by running an external binary and converts them to an efficient format to save disk space.

I want to minimize the disk operations, because the disk is already pretty old and I want it to last a while longer.

At the moment the program writes the data from the camera to disk, converts it, and then removes the original. It does that for every image: 1) it writes a large file to disk, 2) reads it back to convert it, 3) then deletes it... a bunch of disk operations that aren't necessary and could be done in RAM, because the original file never needs to be stored; it's only used as the basis for creating another one.

I was sure a ramdisk was the solution, so I googled how to set one up, and Google returned a bunch of links discouraging the use of ramdisks, for many reasons: they are not useful on modern systems (I'm running a pretty recent Linux kernel); they should only be used for decrypted data that must never hit the disk; some tests show a ramdisk can actually be slower than the hard disk; the operating system already has a cache...

So I'm confused...

In this situation, should I use a ramdisk?

Thank you.

PS: If you want more info: I have a proprietary high-res camera and a proprietary binary that I run to capture a single image. I can specify where it writes the file, which is a huge TIFF; the Python program then runs ImageMagick's convert to turn it into a JPEG and compresses that into a tar.bz2, so the quality is almost the same but the file size is about 1/50 of the TIFF's.
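
For reference, a minimal sketch of that per-image flow (Python 2.7); the capture binary's name, its flags, and the paths are placeholders for the proprietary tool:

    import os
    import subprocess
    import tarfile

    def capture_and_convert(index):
        tiff_path = '/data/frame_%06d.tiff' % index
        jpeg_path = '/data/frame_%06d.jpg' % index

        # 1) the proprietary binary writes a huge TIFF to disk
        subprocess.check_call(['capture_binary', '--output', tiff_path])

        # 2) ImageMagick reads the TIFF back and writes a JPEG
        subprocess.check_call(['convert', tiff_path, jpeg_path])

        # 3) pack the JPEG into a tar.bz2, then delete the intermediates
        with tarfile.open(jpeg_path + '.tar.bz2', 'w:bz2') as tar:
            tar.add(jpeg_path, arcname=os.path.basename(jpeg_path))
        os.remove(tiff_path)
        os.remove(jpeg_path)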


Solution

My experience with ramdisks is congruent with what you've mentioned here. I lost performance when I moved to them, because there was less memory available for the kernel to do its caching intelligently, and that messed things up.

However, from your question, I understand that you want to optimise for the number of disk operations rather than for speed, in which case a RAM disk might make sense. As with most problems of this kind, measuring is the right way to decide.
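
One rough way to measure it is to sample /proc/diskstats before and after a capture cycle and compare the sectors-written counter (the device name here, sda, is an assumption; substitute your own):

    def sectors_written(device='sda'):
        # /proc/diskstats fields after the device name:
        # reads, reads merged, sectors read, ms reading,
        # writes, writes merged, sectors written, ...
        with open('/proc/diskstats') as f:
            for line in f:
                parts = line.split()
                if parts[2] == device:
                    return int(parts[9])  # sectors written since boot
        raise ValueError('device not found: %s' % device)

    before = sectors_written()
    # ... run one capture/convert cycle here ...
    after = sectors_written()
    print('sectors written this cycle: %d' % (after - before))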

Another thing that struck me was that if your original image is not that big, you might want to buy a cheap USB stick and do the I/O on that rather than on your main drive. Is that not an option?

OTHER TIPS

Ah, proprietary binaries that only give you certain options. Yay. The simplest solution would be adding a solid-state drive. You will still be saving to disk, but disk I/O throughput will be much higher for both reading and writing, and there are no moving parts to wear out.

A better solution would be having the binary write the TIFF to stdout, perhaps in a different format, and piping it into your Python program. The data would never hit the hard drive at all, but it would be more work. Of course, if the binary doesn't allow this, the point is moot.
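
A sketch of what that could look like, assuming (hypothetically) that the capture binary accepts "-" as an output path to mean stdout; check its documentation:

    import subprocess

    capture = subprocess.Popen(['capture_binary', '--output', '-'],
                               stdout=subprocess.PIPE)
    # ImageMagick's convert can read TIFF data from stdin via "tiff:-"
    convert = subprocess.Popen(['convert', 'tiff:-', 'frame.jpg'],
                               stdin=capture.stdout)
    capture.stdout.close()  # so capture gets SIGPIPE if convert exits early
    convert.wait()
    capture.wait()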

If on Debian (and possibly its derivatives), use the "/run/shm" directory; it is mounted as tmpfs, so files written there live in RAM and never touch the disk.
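
For example, keeping the intermediates on tmpfs so that only the final archive touches the real disk (a sketch; the binary name and paths are assumptions, and you should verify that /run/shm is actually a tmpfs mount on your system):

    import os
    import subprocess
    import tarfile

    TMP = '/run/shm/timelapse'
    if not os.path.isdir(TMP):
        os.makedirs(TMP)

    tiff_path = os.path.join(TMP, 'frame.tiff')
    jpeg_path = os.path.join(TMP, 'frame.jpg')

    # both intermediates stay in RAM
    subprocess.check_call(['capture_binary', '--output', tiff_path])
    subprocess.check_call(['convert', tiff_path, jpeg_path])

    # only this write hits the physical disk
    with tarfile.open('/data/frame.tar.bz2', 'w:bz2') as tar:
        tar.add(jpeg_path, arcname='frame.jpg')

    os.remove(tiff_path)
    os.remove(jpeg_path)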
