Speeding up writing images into hard disk in OpenCV

Question 1

Let's see: 2048*1080*3 (number of channels)*50 fps ~= 332MB/s (about 316MiB/s) if you were writing the images raw. If you're using JPEG, depending on the compression parameters you may get a substantial reduction, but even at 1/5th of that you're still writing a lot of data to the hard drive, especially if it's a 5400rpm drive in a laptop.
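To make the arithmetic above easy to redo for other resolutions, here is a minimal sketch. The function name and the `bytes_per_sample` parameter are mine, not from the question; it assumes uncompressed frames with one byte per channel.

```python
def raw_data_rate(width, height, channels, fps, bytes_per_sample=1):
    """Bytes per second needed to store uncompressed frames."""
    return width * height * channels * bytes_per_sample * fps

rate = raw_data_rate(2048, 1080, 3, 50)
print(f"{rate / 1e6:.1f} MB/s")     # decimal megabytes: 331.8
print(f"{rate / 2**20:.1f} MiB/s")  # binary mebibytes: 316.4
```

The two printed figures differ only in the unit: the 316 figure is in units of 2^20 bytes.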

Things you could do:

As David Schwartz suggests, you should use queues and multiple threads.
If you're effectively writing an image sequence, save a video instead. The data is compressed much more and the writing to disk is faster.
Check the specs of your current device and get an estimate of the maximum size of the images that you can write to it. Choose compression parameters to fit that size constraint.
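For the last point, a rough sketch of how the size constraint could be estimated. The 80 MB/s disk figure and the 50% headroom are placeholder assumptions, not measurements; plug in whatever sustained write rate you actually measure on your drive.

```python
def frame_budget(disk_bytes_per_s, fps, headroom=0.5):
    """Largest average encoded frame size (bytes) the disk can sustain.

    headroom is the fraction of the disk's write rate we allow ourselves
    to use, leaving the rest for the OS and other applications.
    """
    return disk_bytes_per_s * headroom / fps

def required_ratio(raw_frame_bytes, budget_bytes):
    """Compression ratio needed to fit a raw frame into the budget."""
    return raw_frame_bytes / budget_bytes

raw_frame = 2048 * 1080 * 3        # bytes per uncompressed 3-channel frame
budget = frame_budget(80e6, 50)    # hypothetical 80 MB/s disk, 50 fps
ratio = required_ratio(raw_frame, budget)
print(f"budget: {budget / 1e3:.0f} kB/frame, needs about {ratio:.1f}x compression")
```

You would then pick the JPEG quality (or video bitrate) that gets the average frame under that budget.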

Question 2

There are multiple solutions generally possible, but you need to specify the format of your images - grayscale what? 8 bits? 12 bits? 16 bits?

Most other answers completely miss the mark by ignoring the physical reality of what you're trying to do: the bandwidth, both in terms of I/O and processing, is of primary importance.

Did you verify the storage bandwidth available on your system, in realistic conditions? It is generally a bad idea to store this stream on the same drive your operating system lives on, because the seeks caused by other applications will eat into your bandwidth. Remember that on a modern 50+Mbyte/s hard drive with 5ms seeks, one seek costs you 0.25MBytes of bandwidth, and that's rather optimistic, since modern "run of the mill" hard drives read faster and seek slower, on average. I'd say 1MByte lost per seek is a conservative estimate on yesteryear's consumer drives.
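The seek cost is just throughput multiplied by seek time. A small sketch of that estimate follows; the function names and the 20 seeks per second in the example are illustrative assumptions, and the model ignores caching and command queueing.

```python
def seek_cost_bytes(throughput_bytes_per_s, seek_time_s):
    """Data that could have been written during one seek."""
    return throughput_bytes_per_s * seek_time_s

def effective_throughput(throughput_bytes_per_s, seek_time_s, seeks_per_s):
    """Throughput left after spending part of every second seeking."""
    busy = min(1.0, seek_time_s * seeks_per_s)  # fraction of time lost to seeks
    return throughput_bytes_per_s * (1.0 - busy)

print(seek_cost_bytes(50e6, 0.005) / 1e6, "MB lost per seek")
print(effective_throughput(50e6, 0.005, 20) / 1e6, "MB/s left with 20 seeks/s")
```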

If you need to write raw frames and don't want to compress them even in a lossless fashion, then you need a storage system that can support the requisite bandwidth. Assuming 8 bit grayscale, you'll be dumping 2Mbytes/frame, at 50Hz that's 100Mbytes/s. A striped RAID 0 array of two contemporary off-the-shelf drives should be able to cope with it without problems.
If you are OK with burning some serious CPU or GPU for compression, but still want lossless storage, then JPEG2000 is the default choice. If you use a GPU implementation, it will leave your CPU alone for other things. I'd think the expected bandwidth reduction is 2x, so your RAID 0 will have plenty of bandwidth to spare. That would be the preferred way to use it - it will be very robust and you won't be losing any frames no matter what else the system is doing (within reason, of course).
If you are OK with lossy compression, then off-the-shelf JPEG libraries will do the trick. You'd probably want a 4x reduction in size, and the resultant 25Mbytes/s data stream can be handled by the hard drive the OS lives on.
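To compare the three options side by side, here is a quick sketch. It assumes 2048x1080 8-bit grayscale frames (the resolution is taken from the question, not stated in this answer) and uses the exact frame size rather than the rounded 2Mbytes/frame; the 2x and 4x ratios are the rough guesses from above, not measured values.

```python
FRAME_BYTES = 2048 * 1080  # 8-bit grayscale: one byte per pixel
FPS = 50

def stream_rate(compression_ratio):
    """Bytes per second hitting the disk for a given compression ratio."""
    return FRAME_BYTES * FPS / compression_ratio

options = [("raw", 1), ("lossless, ~2x", 2), ("lossy, ~4x", 4)]
for name, ratio in options:
    print(f"{name:15s} {stream_rate(ratio) / 1e6:6.1f} MB/s")
```

That works out to roughly 111, 55 and 28 MB/s respectively.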

As for the implementation: two threads are enough if there's no compression. One thread captures the images, another one dumps them to the drive. If you see no improvement compared to a single thread, then it's solely due to the bandwidth limitations of your drive. If you use GPU for compression, then one thread that handles compression is enough. If you use CPU for compression, then you need as many threads as there are cores.
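A minimal sketch of the two-thread, no-compression case, using only the Python standard library. Frames are assumed to already be `bytes` objects, and the file naming and queue size are arbitrary choices for illustration; in a real program the capture side would pull frames from the camera API instead of iterating over a list.

```python
import os
import queue
import threading

def capture(frames, q):
    """Producer: hand every frame to the writer, then signal the end."""
    for index, frame in enumerate(frames):
        q.put((index, frame))  # blocks when the writer falls behind
    q.put(None)                # sentinel: no more frames

def writer(q, out_dir):
    """Consumer: dump raw frames to disk until the sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            break
        index, frame = item
        path = os.path.join(out_dir, f"frame_{index:06d}.raw")
        with open(path, "wb") as f:
            f.write(frame)

def record(frames, out_dir, max_pending=64):
    """Capture in the calling thread while a second thread writes."""
    q = queue.Queue(maxsize=max_pending)
    t = threading.Thread(target=writer, args=(q, out_dir))
    t.start()
    capture(frames, q)
    t.join()
```

The bounded queue makes the bandwidth limit visible: if it stays full, the drive cannot keep up and no amount of threading will help.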

There is no issue at all with storing image differences. In fact, JPEG2k loves this, and you may get an overall 2x compression improvement (for a total factor of 4x) if you're lucky. What you do is store a bunch of difference frames for each reference frame stored in full. The ratio is based solely on the needs of the processing done afterwards - you're trading off resilience to data loss and interactive processing latency for decreased storage-time bandwidth.

I'd say anywhere between 1:5 and 1:50 ratio is reasonable. With the latter, the loss of the reference frame knocks out 1s worth of data, and randomly seeking anywhere in the data requires on average a read of a reference frame and 24 delta frames, plus the cost of decompressing 25 frames.
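Here is a rough sketch of the reference/delta scheme, assuming each delta is taken against the previous frame (which is what makes a random seek cost a reference frame plus the deltas leading up to the target). `zlib` is only a stand-in for a real codec such as JPEG2000, and all function names are made up for the example.

```python
import zlib

def delta(cur, prev):
    """Byte-wise difference of two equally sized frames, modulo 256."""
    return bytes((c - p) & 0xFF for c, p in zip(cur, prev))

def undelta(diff, prev):
    """Inverse of delta(): rebuild the current frame from the previous one."""
    return bytes((d + p) & 0xFF for d, p in zip(diff, prev))

def encode(frames, group=50):
    """Store one full reference frame per group, deltas for the rest."""
    records, prev = [], None
    for i, frame in enumerate(frames):
        if i % group == 0:
            records.append(("ref", zlib.compress(frame)))
        else:
            records.append(("delta", zlib.compress(delta(frame, prev))))
        prev = frame
    return records

def decode_frame(records, index, group=50):
    """Random access: read the group's reference, then replay the deltas."""
    start = index - index % group
    frame = zlib.decompress(records[start][1])
    for i in range(start + 1, index + 1):
        frame = undelta(zlib.decompress(records[i][1]), frame)
    return frame
```

With `group=50` at 50 fps, losing one reference record makes the following second of video undecodable, which is the trade-off described above.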

Question 3

Compression is the key here.

From the imwrite docs:

For JPEG, it can be a quality ( CV_IMWRITE_JPEG_QUALITY ) from 0 to 100 (the higher is the better). Default value is 95.

For PNG, it can be the compression level ( CV_IMWRITE_PNG_COMPRESSION ) from 0 to 9. A higher value means a smaller size and longer compression time. Default value is 3.

For PPM, PGM, or PBM, it can be a binary format flag ( CV_IMWRITE_PXM_BINARY ), 0 or 1. Default value is 1.

For .bmp format compression is not needed since it directly writes the bitmap.

In summary: Image write time png > jpg > bmp

If you don't care about the disk size, I would say go with .bmp format which is almost 10 times faster than writing a png and 6 times faster than writing a jpg.
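The ratios above depend on the machine, the disk and the image content, so it's worth measuring on your own setup. A small timing sketch follows; `time_writes` and `opencv_writer` are helper names I made up. The only OpenCV call used is `cv2.imwrite(filename, img, params)`, and note that in the Python bindings of newer OpenCV versions the flags are spelled `cv2.IMWRITE_JPEG_QUALITY` and `cv2.IMWRITE_PNG_COMPRESSION` rather than `CV_IMWRITE_*`.

```python
import time

def time_writes(write_fn, frames, make_path):
    """Average seconds per call of write_fn(path, frame) over all frames."""
    start = time.perf_counter()
    for i, frame in enumerate(frames):
        write_fn(make_path(i), frame)
    return (time.perf_counter() - start) / len(frames)

def opencv_writer(params=None):
    """Build a write_fn around cv2.imwrite (requires OpenCV to be installed)."""
    import cv2  # imported lazily so time_writes() is usable without OpenCV

    def write(path, image):
        if not cv2.imwrite(path, image, params or []):
            raise IOError(f"cv2.imwrite failed for {path}")

    return write
```

For example, `time_writes(opencv_writer([cv2.IMWRITE_JPEG_QUALITY, 90]), frames, lambda i: f"out/{i:06d}.jpg")` would time JPEG writes at quality 90 for a list of images `frames`; repeat with `.png` and `.bmp` paths to compare.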

Question 4

I would suggest taking a look into the QtMultimedia module, and if you are dealing with streams as opposed to images, try to convert your code to MPEG.

That will avoid dealing with every pixel all the time, as only the pixel differences will be processed. That could potentially give a performance increase for the processing.

You could, for sure, also take a look at heftier compression algorithms, but that is outside the scope of Qt; Qt's role would probably be limited to interfacing with those algorithms.

Question 5

You should have a queue of images to be processed. You should have a capture thread that captures images and places them on the queue. You should have a few compress/write threads that take images off the queue and compress/write them.

There's a reason CPUs have more than one core these days -- so you don't have to finish one thing before you start the next.

If you believe that this was what you were doing and you are still seeing the same issue, show us your code. You are most likely doing this incorrectly.

Update: As I suspected, you are using threads in a way that doesn't accomplish the objective for using threads in the first place. The whole point was to compress more than one image at a time because we know it takes 30 ms to compress an image and we know that compressing one image every 30 ms is insufficient. The way you are using threads, you still only try to compress one image at a time. So 30 ms to compress/write an image is still too long. The queue serves no purpose, since only one thread reads from it.
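For reference, a minimal sketch of the intended structure: one producer and several workers reading from the same queue, so that several images are being compressed at once. `zlib.compress` stands in for the real JPEG/PNG encoding, results are kept in a dict instead of being written to disk, and the worker count and queue size are arbitrary; all of that is assumption for the sake of a self-contained example.

```python
import queue
import threading
import zlib

def worker(q, results, lock):
    """Take frames off the shared queue and compress them until told to stop."""
    while True:
        item = q.get()
        if item is None:            # sentinel: this worker is done
            break
        index, frame = item
        data = zlib.compress(frame)  # stand-in for encoding the image
        with lock:
            results[index] = data    # a real worker would write to disk here

def run(frames, num_workers=4, max_pending=32):
    """Feed frames to num_workers compress threads; return {index: data}."""
    q = queue.Queue(maxsize=max_pending)
    results, lock = {}, threading.Lock()
    threads = [threading.Thread(target=worker, args=(q, results, lock))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in enumerate(frames):   # the capture side: (index, frame) pairs
        q.put(item)
    for _ in threads:
        q.put(None)                  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

The key difference from a single consumer is that with `num_workers` threads pulling from the queue, a 30 ms compression no longer limits you to one image every 30 ms, provided the encoder actually runs in parallel (in Python that means it has to release the GIL while it works).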