DCT implementation

https://stackoverflow.com/questions/23203778

07-07-2023

Question

I'm trying to implement an image compression algorithm based on the DCT for color JPEG. I'm new to image processing, so I need some help. What I need is clarification of the algorithm.

I'm using the DCT implementation from here

So, here is the algorithm as I understood it:

1) Load an image into a BufferedImage using ImageIO.

2) Create 3 matrices (one for each channel: red, green, blue):

    int rgb = bufferedImage.getRGB(i, j);
    int red = (rgb >> 16) & 0xFF;
    int green = (rgb >> 8) & 0xFF;
    int blue = rgb & 0xFF;

3) Pad the matrices so that their dimensions are multiples of 8 and they can be split into 8x8 chunks (where 8 is the size of the DCT matrix, N).
4) Split each matrix into 8x8 chunks (result: splittedImage).
5) Perform forwardDCT on each matrix in splittedImage (result: dctImage).
6) Perform quantization on each matrix in dctImage (result: quantizedImage).
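
The padding, splitting, DCT and quantization steps above can be sketched roughly as follows. This is only an illustration, not the linked DCT implementation: the class name BlockDctSketch and its methods are hypothetical, padding by repeating edge pixels is one possible choice, and the quantization table is passed in by the caller rather than taken from any particular source.

```java
// Hypothetical sketch: pad a channel, cut 8x8 blocks, apply a naive DCT, quantize.
final class BlockDctSketch {
    static final int N = 8; // DCT block size

    // Pads a channel so both dimensions are multiples of N.
    // Assumption: padding repeats the last row/column (other schemes are possible).
    static int[][] pad(int[][] channel) {
        int h = channel.length, w = channel[0].length;
        int ph = (h + N - 1) / N * N, pw = (w + N - 1) / N * N;
        int[][] out = new int[ph][pw];
        for (int y = 0; y < ph; y++)
            for (int x = 0; x < pw; x++)
                out[y][x] = channel[Math.min(y, h - 1)][Math.min(x, w - 1)];
        return out;
    }

    // Copies the N x N block at block coordinates (by, bx),
    // shifting 8-bit samples from 0..255 to -128..127 as JPEG does before the DCT.
    static double[][] block(int[][] padded, int by, int bx) {
        double[][] b = new double[N][N];
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++)
                b[y][x] = padded[by * N + y][bx * N + x] - 128;
        return b;
    }

    // Naive O(N^4) two-dimensional DCT-II with the usual JPEG scaling factors.
    static double[][] forwardDct(double[][] b) {
        double[][] f = new double[N][N];
        for (int u = 0; u < N; u++) {
            for (int v = 0; v < N; v++) {
                double sum = 0;
                for (int y = 0; y < N; y++)
                    for (int x = 0; x < N; x++)
                        sum += b[y][x]
                                * Math.cos((2 * y + 1) * u * Math.PI / (2 * N))
                                * Math.cos((2 * x + 1) * v * Math.PI / (2 * N));
                double cu = (u == 0) ? Math.sqrt(0.5) : 1.0;
                double cv = (v == 0) ? Math.sqrt(0.5) : 1.0;
                f[u][v] = (2.0 / N) * cu * cv * sum;
            }
        }
        return f;
    }

    // Divides each coefficient by the matching table entry and rounds to an integer.
    static int[][] quantize(double[][] f, int[][] table) {
        int[][] q = new int[N][N];
        for (int u = 0; u < N; u++)
            for (int v = 0; v < N; v++)
                q[u][v] = (int) Math.round(f[u][v] / table[u][v]);
        return q;
    }
}
```

Each 8x8 block is processed independently, so the same four calls would be repeated for every block of every channel.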

Here I don't know what to do. I can:

Merge the quantizedImage matrices into one matrix mergedImage, convert it into a Vector and perform the compressImage method on it.
Or convert the small matrices from quantizedImage into Vectors, perform the compressImage method on each of them, and then merge them into one matrix.

So, at this point I have 3 matrices for the red, green and blue channels. Then I combine those matrices into one RGB matrix, create a new BufferedImage, and use the setRGB method to set the pixel values. Then I save the image to a file.

Extra questions:

1) Is it better to convert RGB into YCbCr and perform the DCT on Y, Cb and Cr?
2) The Javadoc of the compressImage method says that it uses run-length encoding, not Huffman encoding. So will an image viewer be able to open the compressed image? Or should I use Huffman encoding according to the JPEG specification, and is there an open source Huffman encoding implementation in Java?

Answer

If you want to follow the implementation steps, I suggest reading:

http://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434/ref=sr_1_1?ie=UTF8&qid=1399765722&sr=8-1&keywords=compressed+image+file+formats

Regarding your questions:

1) The JPEG standard knows nothing about color spaces and does not care whether you use RGB, YCbCr, or CMYK. There are several JPEG file formats (e.g., JFIF, Exif, Adobe) that specify the color space, usually YCbCr.

The reason for using YCbCr is that it follows the JPEG trend of concentrating information. There tends to be more useful information in the Y component than in the Cb or Cr components. Using YCbCr, you can sample 4 Ys for every Cb and Cr (or even 16 Ys). With 4 Ys for every Cb and Cr, that reduces the amount of data to be compressed by 1/2.
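
As a rough illustration of the conversion and of where the factor of 1/2 comes from, here is a minimal sketch. It assumes the full-range YCbCr equations used by JFIF; the class YCbCrSketch and its methods are hypothetical names, and averaging each 2x2 group is just one simple way to subsample chroma.

```java
// Hypothetical sketch: RGB to YCbCr (JFIF-style, full range) and 2x2 chroma subsampling.
final class YCbCrSketch {
    // Converts a packed 0xRRGGBB pixel (as returned by BufferedImage.getRGB) to {Y, Cb, Cr}.
    static int[] toYCbCr(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        double y  = 0.299 * r + 0.587 * g + 0.114 * b;
        double cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b;
        double cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b;
        return new int[] { clamp(y), clamp(cb), clamp(cr) };
    }

    // Rounds to the nearest integer and limits the result to 0..255.
    static int clamp(double v) {
        return (int) Math.max(0, Math.min(255, Math.round(v)));
    }

    // Replaces every 2x2 group of chroma samples by its rounded average.
    // Assumption: both dimensions are even.
    static int[][] subsample2x2(int[][] c) {
        int h = c.length / 2, w = c[0].length / 2;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                out[y][x] = (c[2 * y][2 * x] + c[2 * y][2 * x + 1]
                        + c[2 * y + 1][2 * x] + c[2 * y + 1][2 * x + 1] + 2) / 4;
        return out;
    }

    // Number of samples to compress when Cb and Cr are subsampled 2x2 and Y is kept whole.
    static int samplesAfterSubsampling(int width, int height) {
        return width * height + 2 * (width / 2) * (height / 2);
    }
}
```

For a 16x16 image this gives 256 Y samples plus 64 Cb and 64 Cr samples, 384 in total, compared with 768 for three full-resolution channels.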

Note that the JPEG file formats specify limits on sampling (the JPEG standard itself allows 2:3 sampling, while most implementations do not support it).

2) The DCT coefficients are run-length encoded, then Huffman (or arithmetic) encoded. You have to use both.
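
To give an idea of the run-length stage, here is a simplified sketch that reads one quantized 8x8 block in zigzag order and turns its AC coefficients into (zero run, value) pairs. The class ZigzagRleSketch is a hypothetical name, and the scheme is deliberately simplified: real JPEG also splits zero runs longer than 15, codes the DC coefficient as a difference from the previous block, and then Huffman (or arithmetic) encodes the resulting symbols.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: zigzag scan of one block and simplified run-length pairing.
final class ZigzagRleSketch {
    // Reads a square block along its anti-diagonals in zigzag order:
    // (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ...
    static int[] zigzag(int[][] block) {
        int n = block.length;
        int[] out = new int[n * n];
        int k = 0;
        for (int s = 0; s <= 2 * (n - 1); s++) {
            int lo = Math.max(0, s - (n - 1));
            int hi = Math.min(s, n - 1);
            if (s % 2 == 1) {
                // Odd diagonals run from the top right to the bottom left.
                for (int r = lo; r <= hi; r++) out[k++] = block[r][s - r];
            } else {
                // Even diagonals run from the bottom left to the top right.
                for (int r = hi; r >= lo; r--) out[k++] = block[r][s - r];
            }
        }
        return out;
    }

    // Turns the AC coefficients (index 1 onward) into {zeroRun, value} pairs.
    // Trailing zeros are replaced by a single {0, 0} end-of-block marker.
    // Simplification: runs longer than 15 are not split as real JPEG requires.
    static List<int[]> runLengthAc(int[] zz) {
        List<int[]> pairs = new ArrayList<>();
        int run = 0;
        for (int i = 1; i < zz.length; i++) {
            if (zz[i] == 0) {
                run++;
            } else {
                pairs.add(new int[] { run, zz[i] });
                run = 0;
            }
        }
        if (run > 0) pairs.add(new int[] { 0, 0 });
        return pairs;
    }
}
```

Because quantization leaves mostly zeros in the high-frequency corner of each block, the zigzag order groups them at the end, where a single end-of-block marker replaces them.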

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow