How to join 2 jpegs together losslessly without decoding using a hex editor?

https://stackoverflow.com/questions/609586

jpeg

03-07-2019

Question

I am trying to write a program (probably in Java) to join a number of JPEGs together losslessly without decoding them first.

I thought I'd start simple and try and append 2 jpegs of the same size compressed with the same settings one above the other using a hex editor.

First I extract the image data of jpeg B and append it to jpeg A. By modifying the dimensions specified in the headers I get a new recognizable picture (jpeg A + jpeg B appended in the y axis) which can be displayed. However, although the image data from jpeg B is clearly recognizable, it seems to have lost a lot of colour information and is clearly incorrect.
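For reference, the header edit described above can be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names (find_sof, read_size, set_height) are hypothetical, and it assumes a well-formed file with one frame header (SOF0, SOF1 or SOF2) before the scan data. In a frame header the height is the 2-byte big-endian field right after the 1-byte sample precision, and the width follows it.

```python
import struct

# Frame header markers handled here: baseline, extended sequential, progressive.
SOF_MARKERS = {0xC0, 0xC1, 0xC2}

def find_sof(data: bytes) -> int:
    """Return the offset of the 0xFF byte of the first SOF marker."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:          # fill byte before a marker
            pos += 1
            continue
        if marker in SOF_MARKERS:
            return pos
        if marker == 0xDA:          # start of scan: the headers are over
            break
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    raise ValueError("no SOF marker found before the scan data")

def read_size(data: bytes):
    """Return (width, height) as stored in the frame header."""
    pos = find_sof(data)
    # Layout: marker(2) length(2) precision(1) height(2) width(2)
    height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
    return width, height

def set_height(data: bytes, new_height: int) -> bytes:
    """Return a copy of the file with the height field rewritten."""
    pos = find_sof(data)
    return data[:pos + 5] + struct.pack(">H", new_height) + data[pos + 7:]
```

Changing the height this way only tells the decoder how many rows to expect; the scan data still has to contain that many rows of correctly encoded image data.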

So my question is: what steps am I missing here? I don't think there are any other dimension-specific header values I need to change, so maybe I need to Huffman decode the image data from both jpegs, then append them together and then re-encode the lot?

I've spent some time reading up on jpeg specs and headers etc but to be honest I'm out of my depth and could really do with a pointer or two!

Thanks a lot for any help.

Thanks for all the suggestions. Yes this is definitely possible, I should have mentioned jpegtran in my original question. I am basically trying to replicate this aspect of jpegtran functionality but use it in my own program. I guess I should look at the jpegtran source but I know nothing about C and not very much about programming in general so reverse engineering source code is easier said than done!

Solution 2

Ok I worked out where I was going wrong.

1) The image scan data is stored in bytes, but the actual information is encoded as variable-length bit strings. This means that the end of the actual image data does not necessarily fall on a byte boundary. When the jpeg encoder needs to pad out the bits to reach a byte boundary, it simply adds a series of 1s.

2) The way the actual pixel info is stored is a little too complicated (at least for me) to explain, but basically everything is encoded in MCUs (minimum coded units). These vary in size depending on the chroma subsampling, with horizontal and vertical sizes of either 8 or 16 pixels. Each MCU holds blocks for the luminance component, Y, and the chrominance components, Cb and Cr, and each block has a DC part and an AC part. The problem was that each DC value is stored as a difference from the DC value of the previous block of the same component. So when I added the new image data from jpg B, its first DC values were stored relative to 0 (because it had no previous MCUs), but they needed to take into account the final DC values of the last MCU from jpg A (hope that makes sense).
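The DC problem in point 2 can be shown with a small Python sketch. It is a simplification: it tracks a single component with made-up DC values, whereas a real scan keeps a separate predictor for each of Y, Cb and Cr. The function names are hypothetical.

```python
def dc_to_diffs(dc_values, predictor=0):
    """Encode absolute DC values as differences from the previous value."""
    diffs = []
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc
    return diffs

def diffs_to_dc(diffs, predictor=0):
    """Decode a list of differences back into absolute DC values."""
    dc_values = []
    for diff in diffs:
        predictor += diff
        dc_values.append(predictor)
    return dc_values

# Made-up DC values for one component of two images.
dc_a = [10, 12, 9]
dc_b = [40, 41]

# Appending B's scan data unchanged: B's first difference is still relative to 0,
# so every DC value of B is shifted by A's final DC value (9).
naive = dc_to_diffs(dc_a) + dc_to_diffs(dc_b)
print(diffs_to_dc(naive))   # [10, 12, 9, 49, 50]

# Re-encoding B's first difference relative to A's last DC value fixes it.
fixed = dc_to_diffs(dc_a) + dc_to_diffs(dc_b, predictor=dc_a[-1])
print(diffs_to_dc(fixed))   # [10, 12, 9, 40, 41]
```

Only the first difference of each component in B changes, but because it is Huffman coded with a variable length, every bit after it shifts position.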

The solution:

You need to do an initial decode (Huffman + run-length) of the image data to find out exactly where the image data ends and then strip the trailing 1s. You also need to change the initial DC values in the second jpg appropriately. You then need to re-encode the appropriate bits, add 1s to fit to a byte boundary, et voila.
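The padding step can be sketched in Python as below. Bit strings are held as Python strings of '0' and '1' purely for readability, the function names are hypothetical, and the true bit length of jpg A's data is assumed to be known already from the initial decode. The sketch also applies the JPEG byte-stuffing rule, which inserts a 0x00 after every 0xFF byte inside scan data so that it cannot be mistaken for a marker.

```python
def pack_bits(bits: str) -> bytes:
    """Pack a '0'/'1' string into scan bytes: pad with 1s, stuff 0x00 after 0xFF."""
    bits += "1" * (-len(bits) % 8)       # pad to a byte boundary with 1 bits
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = int(bits[i:i + 8], 2)
        out.append(byte)
        if byte == 0xFF:                 # byte stuffing
            out.append(0x00)
    return bytes(out)

def unpack_bits(scan: bytes) -> str:
    """Undo the byte stuffing and return the bits (padding still included)."""
    bits = []
    i = 0
    while i < len(scan):
        byte = scan[i]
        bits.append(f"{byte:08b}")
        if byte == 0xFF and i + 1 < len(scan) and scan[i + 1] == 0x00:
            i += 1                       # skip the stuffed zero byte
        i += 1
    return "".join(bits)

# Made-up bit strings standing in for the real entropy-coded data.
bits_a = "10110"   # 5 real bits, padded to 0xB7 when stored alone
bits_b = "0011"    # 4 real bits, padded to 0x3F when stored alone

# Joining at byte level keeps A's three padding bits in the middle of the stream.
print(unpack_bits(pack_bits(bits_a) + pack_bits(bits_b)))  # 1011011100111111

# Joining at bit level and padding once at the end gives a clean stream.
print(pack_bits(bits_a + bits_b).hex())                    # b1ff00
```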

If you want to append in the x-axis, it's a little more complicated. Jpgs scan from left to right, then top to bottom, so you have to rearrange the MCUs so that they come in the right scan order and then adjust the DC values appropriately.
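The reordering for the x-axis case can be sketched in Python like this. It assumes the MCUs of each image have already been separated out by the initial decode and are held in scan order in a list; the function name and the string placeholders are hypothetical.

```python
def join_horizontally(mcus_a, cols_a, mcus_b, cols_b):
    """Interleave two row-major MCU lists so that B sits to the right of A."""
    if len(mcus_a) % cols_a or len(mcus_b) % cols_b:
        raise ValueError("MCU count is not a whole number of rows")
    rows = len(mcus_a) // cols_a
    if rows != len(mcus_b) // cols_b:
        raise ValueError("images must have the same number of MCU rows")
    joined = []
    for r in range(rows):
        # One full output row is A's MCU row followed by B's MCU row.
        joined.extend(mcus_a[r * cols_a:(r + 1) * cols_a])
        joined.extend(mcus_b[r * cols_b:(r + 1) * cols_b])
    return joined

# Two images, each 2 MCUs wide and 2 MCUs high.
a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]
print(join_horizontally(a, 2, b, 2))
# ['a0', 'a1', 'b0', 'b1', 'a2', 'a3', 'b2', 'b3']
```

After this reordering almost every MCU has a different predecessor than before, so the DC differences have to be recomputed along the whole new sequence, not just at one seam.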

So far I've only tested this on single MCU jpgs, but theoretically it should work with bigger ones too.

BTW I only worked this out thanks to the owner of this excellent jpg related resource/blog

OTHER TIPS

This is very much doable. I did it on a lot of Google map image tiles to join those and form a poster size image. There is a package for Unix called JPEG Tools for doing exactly this. The program is called jpegjoin. Pure C source, with Windows binaries available. When compiled it creates a command line application which, when run, joins two jpeg images losslessly, among many other things. It does NOT decompress any image, it just merges the compressed data together and fixes the header accordingly. I used it to merge 100 images to create 50 strips and then merged those strips again to create a large image.

More information can be found at http://en.wikipedia.org/wiki/Lossy_compression#Lossless_editing

Source code

Source code for the underlying jpegtran library can be found here. An example script to mimic jpegjoin is here.

JPEG is, like MP3, typically stable when you recompress it (using the same algorithm).

So, when you join the images and recompress them, just make sure that the new compression rate is higher than or equal to the highest of the 2 pictures. That way you won't really lose accuracy.

Two approaches:

1) decode both source JPEG images, merge the resulting bitmaps and encode again as JPEG. Disadvantage here is the re-compression.

2) Ensure that the source image width and height are multiples of 16, possibly by cropping the images. Do not decode the images but instead assemble the target JPEG from the source MCU blocks (16 x 16 pixels in size, hence the cropping).
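For approach 2, the crop size can be worked out from the sampling factors stored in the frame header. The Python sketch below uses hypothetical function names. The 16 x 16 figure above corresponds to 4:2:0 subsampling; with other subsampling the MCU can be 8 x 8 or 16 x 8.

```python
def mcu_size(sampling):
    """Return the MCU (width, height) in pixels for per-component (H, V) factors."""
    h_max = max(h for h, _ in sampling)
    v_max = max(v for _, v in sampling)
    return 8 * h_max, 8 * v_max

def crop_to_mcu(width, height, sampling):
    """Round the dimensions down to whole MCUs."""
    mcu_w, mcu_h = mcu_size(sampling)
    return width - width % mcu_w, height - height % mcu_h

# Sampling factors for Y, Cb, Cr.
YUV420 = [(2, 2), (1, 1), (1, 1)]
YUV444 = [(1, 1), (1, 1), (1, 1)]

print(mcu_size(YUV420))                # (16, 16)
print(mcu_size(YUV444))                # (8, 8)
print(crop_to_mcu(1000, 750, YUV420))  # (992, 736)
```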

I suggest you consider the DRI and RSTn markers. This requires a lot of preconditions, but it works for me: appending a PPM (a bitmap format) to a jpeg which is encoded using DRI and RSTn, where both have the same width and both are multiples of the MCU size.

I just cut out the jpeg data after the last RSTn marker (if any), decode this to PPM, join the two PPMs into a single MCU line, encode it with the same options as the original jpeg, rearrange the RSTs in the resulting jpeg to match the original jpeg, and then append the result to the original jpeg. This method saves me from decoding the whole original jpeg and encoding it again.
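Locating the last RSTn marker can be sketched in Python as below (hypothetical function names, operating on the entropy-coded bytes of one scan). Restart markers are the byte pairs 0xFFD0 to 0xFFD7. At each one the bit stream is padded to a byte boundary and the DC predictors are reset to zero, which is why data can be cut and appended there without fixing up DC values. Because a genuine 0xFF data byte is always followed by a stuffed 0x00, a plain byte search is enough.

```python
def find_restart_markers(scan: bytes):
    """Return (offset, n) for every RSTn marker in the entropy-coded data."""
    found = []
    i = 0
    while i + 1 < len(scan):
        if scan[i] == 0xFF and 0xD0 <= scan[i + 1] <= 0xD7:
            found.append((i, scan[i + 1] - 0xD0))
            i += 2
        else:
            i += 1
    return found

def split_at_last_restart(scan: bytes):
    """Split into (data up to and including the last RSTn, data after it)."""
    markers = find_restart_markers(scan)
    if not markers:
        return b"", scan
    offset, _ = markers[-1]
    return scan[:offset + 2], scan[offset + 2:]

# Made-up scan bytes: 0xFF 0x00 is a stuffed data byte, not a marker.
scan = b"\x12\xff\x00\x34\xff\xd0\x56\xff\xd1\x78"
print(find_restart_markers(scan))   # [(4, 0), (7, 1)]
head, tail = split_at_last_restart(scan)
print(tail.hex())                   # 78
```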

For the decoding and encoding, I'm using libjpeg.

Instead of appending images horizontally, it's probably easier to rotate each image 90 degrees, then join them vertically, then rotate the resulting image back -90 degrees.

Lossless JPEG operations are pretty much an oxymoron. You can do some trickery, like rotating, but that's pretty much it.

I'd suggest you use lossless source images (TIFF or PNG come to mind), or re-evaluate your image quality requirements: resampling a JPEG once again, given decent original files, is indistinguishable to the vast majority of people out there.

What you're trying to do is basically not possible.
The encoding of the JPEG file is somewhat complex and if you're changing the content of the pixels then the encoding is going to be changed. You might end up with an image smaller or larger than the sum of the two images you combine. The only operations which are possible losslessly are ones that maintain a one-to-one correspondence with the pixels of the original image. This basically boils down to 90 degree rotations.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow