Question

I have just downloaded the latest Win32 jpegtran.exe from http://jpegclub.org/jpegtran/ and observed the following:

I have prepared a 24 BPP JPEG test image of 14500 x 10000 pixels.

  • The compressed size in the file system is around 7.5 MB.
  • Decompressing it into memory (with some image viewer) inflates it to around 450 MB.

Monitoring the jpegtran.exe command-line tool's memory consumption during a lossless 180° rotation, I can see the process consuming up to 900 MB of memory!

I would have assumed that such JPEG lossless transformations don't require decoding the image file into memory, and would instead just perform some mathematical transformations on the encoded data itself, keeping the memory footprint very low.

So which of the following is true?

  • some bug in this particular tool's implementation
  • some configuration switch I have missed
  • some misunderstanding on my end (i.e. do JPEG lossless transformations also need to decode the image into memory?)
  • the "mathematical operations" consuming even more memory than "decoding the image into memory"

Edit:

According to the answer by JasonD, the reason seems to be the last one. So I'll extend my question:

Are there any implementations that can do these operations in small chunks (to avoid high memory usage)? Or does it always have to be done on the whole image at once, with no way around it?

PS:
I'm not planning to implement my own codec / algorithm. Instead, I'm asking whether there are any implementations out there that meet my requirements, or whether there could be, at least in theory.

Solution

I don't know about the library in question, but to perform a lossless rotation on a JPEG image, you would at least have to decompress the file as far as the DCT coefficients in order to rotate them, and then re-compress.

The DCT coefficients, fully expanded, will be the same size as or larger than the original image data, as each coefficient carries more bits of information than an 8-bit sample.

It's lossless because the loss in a JPEG comes mainly from quantization of the DCT coefficients. As long as you don't dequantize and re-quantize these, no further loss will be incurred.

But it will be memory intensive.
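
As a rough sanity check, here is a back-of-the-envelope sketch of that expansion, assuming libjpeg's usual 16-bit JCOEF coefficient type and no chroma subsampling (both are assumptions, not facts from the question):

    #include <stdio.h>

    /* Back-of-the-envelope coefficient-storage estimate for a
     * 14500 x 10000 pixel, 3-component image, assuming libjpeg's JCOEF
     * type is a 16-bit short and the chroma planes are not subsampled. */
    int main(void)
    {
        long long width = 14500, height = 10000;
        long long components = 3;      /* Y, Cb, Cr */
        long long bytes_per_coef = 2;  /* sizeof(JCOEF) in typical libjpeg builds */

        long long coef_bytes = width * height * components * bytes_per_coef;

        printf("coefficient storage: ~%lld MB\n", coef_bytes / 1000000);
        /* Prints ~870 MB, the same order as the ~900 MB observed for
         * jpegtran; with 2x2 chroma subsampling it would be ~435 MB. */
        return 0;
    }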

JPEG compression works very roughly as follows (a sketch of the per-block maths follows the list):

  • Convert the image into the YCbCr colour space.
  • Optionally downsample some of the channels (colour error is less perceptible than luminance error, so it is typical to downsample the chroma channels by 2x). This is obviously lossy, but very predictably/stably so.
  • Transform 8x8 blocks of the image with a discrete cosine transform (DCT), moving the image into frequency space. The DCT coefficients also form an 8x8 block, and need more bits of storage than the 8-bit image data did.
  • Quantize the DCT coefficients by a variable amount (this is the quality setting in most packages). The aim is to produce as many small and especially zero coefficients as possible. This is the main lossy step in JPEG compression.
  • Zig-zag through the 2D data to turn it into a 1D stream of coefficients that is roughly in frequency order. High frequencies are more likely to be zeroed out, so each block will ideally end in a run of zeros, which can be truncated.
  • Compress the (now quite compressible) data losslessly using Huffman coding.
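
To make the middle of that list concrete, here is a minimal, unoptimised sketch of the DCT-and-quantize step for a single 8x8 block (the function name fdct_quantize, the flat test block, and the uniform quantization table are purely illustrative; real encoders use fast integer DCTs):

    #include <math.h>
    #include <stdio.h>

    #define N  8
    #define PI 3.14159265358979323846

    /* Forward DCT plus quantization for one 8x8 block, written for
     * clarity rather than speed. 'samples' are 8-bit pixel values,
     * 'quant' is a quantization table (the "quality" setting), and the
     * output 'coef' holds the quantized DCT coefficients: the values a
     * lossless transform operates on. */
    static void fdct_quantize(const unsigned char samples[N][N],
                              const unsigned short quant[N][N],
                              short coef[N][N])
    {
        for (int u = 0; u < N; u++) {
            for (int v = 0; v < N; v++) {
                double cu = (u == 0) ? 1.0 / sqrt(2.0) : 1.0;
                double cv = (v == 0) ? 1.0 / sqrt(2.0) : 1.0;
                double sum = 0.0;

                for (int x = 0; x < N; x++)
                    for (int y = 0; y < N; y++)
                        sum += (samples[x][y] - 128)          /* level shift */
                             * cos((2 * x + 1) * u * PI / 16.0)
                             * cos((2 * y + 1) * v * PI / 16.0);

                /* Quantization is the genuinely lossy step; note the result
                 * needs more than 8 bits per value. */
                coef[u][v] = (short)lround(0.25 * cu * cv * sum / quant[u][v]);
            }
        }
        /* Zig-zag reordering and Huffman coding of 'coef' would follow. */
    }

    int main(void)
    {
        unsigned char samples[N][N];
        unsigned short quant[N][N];
        short coef[N][N];

        /* A flat mid-grey block and a uniform quantization table of 16,
         * purely for demonstration. */
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++) {
                samples[x][y] = 140;
                quant[x][y] = 16;
            }

        fdct_quantize(samples, quant, coef);
        printf("DC coefficient: %d\n", coef[0][0]);  /* (140-128)*8/16 = 6 */
        return 0;
    }

The quantized coefficients in coef are exactly what a lossless transform rearranges; nothing earlier in the pipeline (colour conversion, downsampling, the DCT itself) has to be redone.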

So a 'non-lossy' transformation wants to undo as little of that pipeline as possible: in particular, it stops at the quantized DCT coefficients and never dequantizes or re-quantizes them. But even stopping there does not avoid expanding the data in memory.
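
On the follow-up question about doing this in smaller pieces: jpegtran is built on libjpeg's coefficient-transcoding API, and libjpeg's memory manager can cap its in-memory working set via max_memory_to_use (exposed as jpegtran's -maxmemory switch), spilling the large coefficient arrays to temporary files instead, provided the library was built with a backing-store memory manager (not all builds are). A minimal sketch of that flow under those assumptions, with the actual block shuffling for rotation left out and an illustrative 100 MB cap:

    #include <stdio.h>
    #include <jpeglib.h>

    /* Skeleton of a jpegtran-style lossless transcode using libjpeg's
     * coefficient-access API: the image is never dequantized or put
     * through the inverse DCT; only the entropy coding is undone.
     * This sketch merely copies the coefficients; jpegtran's transupp.c
     * additionally rearranges them to implement rotation/flipping. */
    static int transcode(FILE *infile, FILE *outfile)
    {
        struct jpeg_decompress_struct srcinfo;
        struct jpeg_compress_struct dstinfo;
        struct jpeg_error_mgr jsrcerr, jdsterr;
        jvirt_barray_ptr *coef_arrays;

        srcinfo.err = jpeg_std_error(&jsrcerr);
        jpeg_create_decompress(&srcinfo);
        dstinfo.err = jpeg_std_error(&jdsterr);
        jpeg_create_compress(&dstinfo);

        /* Cap the in-memory working set (illustrative 100 MB). Beyond this,
         * libjpeg's memory manager tries to back the big virtual coefficient
         * arrays with temporary files; this is what jpegtran's -maxmemory
         * switch controls. */
        srcinfo.mem->max_memory_to_use = 100L * 1000L * 1000L;
        dstinfo.mem->max_memory_to_use = srcinfo.mem->max_memory_to_use;

        jpeg_stdio_src(&srcinfo, infile);
        (void)jpeg_read_header(&srcinfo, TRUE);

        /* Entropy-decode the quantized DCT coefficients into virtual arrays
         * (kept in RAM or spilled to disk, depending on the cap above). */
        coef_arrays = jpeg_read_coefficients(&srcinfo);

        /* A real 180-degree rotation would rearrange the blocks inside
         * coef_arrays here, without dequantizing them. */

        jpeg_copy_critical_parameters(&srcinfo, &dstinfo);
        jpeg_stdio_dest(&dstinfo, outfile);

        /* Re-entropy-code the same quantized coefficients: no new loss. */
        jpeg_write_coefficients(&dstinfo, coef_arrays);

        jpeg_finish_compress(&dstinfo);
        jpeg_destroy_compress(&dstinfo);
        (void)jpeg_finish_decompress(&srcinfo);
        jpeg_destroy_decompress(&srcinfo);
        return 0;
    }

    int main(int argc, char **argv)
    {
        if (argc != 3) return 1;
        FILE *in = fopen(argv[1], "rb");
        FILE *out = fopen(argv[2], "wb");
        if (!in || !out) return 1;
        transcode(in, out);
        fclose(in);
        fclose(out);
        return 0;
    }

So the transform still has to visit every coefficient, but they don't all have to be resident in RAM at once; the trade-off is temporary-file I/O rather than a fundamentally chunked algorithm.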

Licensed under: CC-BY-SA with attribution