Question

I've been trying to decompress GIFs in PHP and seem to have everything except the LZW decompression down. I have saved a sample image.

This image is 3 x 5 like this:

Blue  Black Black
Black Blue  Black
Black Black Black
White White White
White White White

I decided to go through manually in Binary and parse this file. The result of manual parsing is below. I am still stuck as to how to decode the raster data here. Can someone break down how the raster data becomes the image? I've been able to break down one image, but nothing else (not this image). I have posted my understanding of how this should break down, but I am obviously doing it wrong.

01000111 G
01001001 I
01000110 F
00111000 8
00111001 9
01100001 a

Screen Descriptor
WIDTH
00000011 3
00000000

HEIGHT
00000101 5
00000000

10010001 GCM (1), CR (001), BPP (001), CD = 2, COLORS = 4

00000000 BGCOLOR Index

00000000 Aspect Ratio

GCM
BLUE
00110101 | 53
00000000 | 0
11000001 | 193

WHITE
11111111 | 255
11111111 | 255
11111111 | 255

BLACK
00000000 | 0
00000000 | 0
00000000 | 0

00000000 | 0
00000000 | 0
00000000 | 0

Extension
00100001 | 21
Function Code
11111001 | F9
Length
00000100 | 4
00000000
00000000
00000000
00000000
Terminator
00000000

Local Descriptor
00101100 Header
XPOS
00000000 | 0
00000000

YPOS
00000000 | 0
00000000

Width
00000011 | 3
00000000

Height
00000101 | 5
00000000

Flags
00000000 (LCM = 0, Interlaced = 0, Sorted = 0, Reserved = 0, Pixel Bits = 0)

RASTER DATA
Initial Code Size
00000010 | 2
Length
00000101 | 5

Data
10000100
01101110
00100111
11000001
01011101

Terminator
00000000

00111011 | ;
00000000

My Attempt

10000100
01101110
00100111
11000001
01011101

Initial Code Size = 3 Read 2 bits at a time

10
00
Append last bit to first (010)
String becomes 010 or 2. 2 would be color # 3 or BLACK

At this point, I am already wrong. The first color should be blue.

Resources I have been using:

http://www.daubnet.com/en/file-format-gif
http://en.wikipedia.org/wiki/Graphics_Interchange_Format
http://www.w3.org/Graphics/GIF/spec-gif87.txt


Solution

GIF parser

You said you want to write your own GIF parser in order to understand how it works. I suggest you look at the source code of a library that contains a GIF reader, such as the de facto reference implementation, GIFLIB. The relevant source file is dgif_lib.c; start at DGifSlurp for decoding, or jump straight to the LZW decompression implementation.

Here's how your image decodes.

I think the issue was that you were splitting the input bytes into LZW codes incorrectly.

Number of colors is 2^(0b001 + 1) = 4.

Code size starts at 2 + 1 = 3 bits.

So the initial dictionary is

000 = color 0 = [blue]
001 = color 1 = [white]
010 = color 2 = [black]
011 = color 3 = [black]
100 = clear dictionary
101 = end of data
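
The numbers above can be derived mechanically from two bytes of the file. This is a minimal sketch in Python (chosen for brevity; it ports directly to PHP). The function name lzw_setup and the dictionary keys are made up for illustration and do not come from any library.

```python
def lzw_setup(packed: int, min_code_size: int) -> dict:
    """Derive the palette size and the initial LZW parameters.

    packed        -- packed-fields byte of the logical screen descriptor
    min_code_size -- the byte that precedes the image data sub-blocks
    """
    has_global_table = bool(packed & 0x80)   # bit 7: global color map present
    table_bits = (packed & 0x07) + 1         # low 3 bits, plus one
    clear_code = 1 << min_code_size          # first code after the colors
    return {
        "has_global_table": has_global_table,
        "num_colors": 1 << table_bits,       # 2 ** table_bits
        "clear_code": clear_code,
        "end_code": clear_code + 1,
        "first_free_code": clear_code + 2,   # first code the decoder assigns
        "code_width": min_code_size + 1,     # initial width in bits
    }


# Values taken from the dump in the question: packed byte 10010001, code size 2.
for name, value in lzw_setup(0b10010001, 2).items():
    print(name, value)
```

For this file it reports 4 colors, clear code 4 (100), end code 5 (101), first free code 6 (110) and an initial width of 3 bits.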

Now, GIF packs LZW codes into bytes in LSB-first order. Accordingly, the first code is stored as the 3 least-significant bits of the first byte; the second code as the next 3 bits; and so on. In your example (first byte: 0x84 = 10000100), the first 2 codes are thus 100 (clear) and 000 (blue). The whole thing

01011101 11000001 00100111 01101110 10000100

is split into codes (the width grows to 4 bits once 111, the highest 3-bit code, has been assigned) as

0101 1101 1100 0001 0010 0111 0110 111 010 000 100
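
The LSB-first packing can be checked with a small bit reader. This is only a sketch: read_codes is a hypothetical helper that reads codes of one fixed width, so it covers just the four 3-bit codes at the start of the stream. Handling the switch to 4 bits needs the decoder state.

```python
def read_codes(data: bytes, width: int, count: int) -> list[int]:
    """Read `count` codes of `width` bits each, least-significant bit first."""
    codes = []
    bit_pos = 0                      # absolute bit offset into the stream
    for _ in range(count):
        code = 0
        for i in range(width):
            byte = data[(bit_pos + i) // 8]
            bit = (byte >> ((bit_pos + i) % 8)) & 1
            code |= bit << i         # earlier bits are less significant
        codes.append(code)
        bit_pos += width
    return codes


raster = bytes([0x84, 0x6E, 0x27, 0xC1, 0x5D])   # the five data bytes
print([format(c, "03b") for c in read_codes(raster, 3, 4)])
# prints ['100', '000', '010', '111']
```

Those four values are the four rightmost groups of the split shown above. Note that the third code straddles the first and second byte.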

This decodes to:

     last
code code
 100      clear dictionary
 000      output [blue] (1st pixel)
 010  000 new code in table:
              output 010 = [black]
              add 110 = old + 1st byte of new = [blue black] to table
 111  010 new code not in table:
              output last string followed by copy of first byte, [black black]
              add 111 = [black black] to table
              111 is largest possible 3-bit code, so switch to 4 bits
0110 0111 new code in table:
              output 0110 = [blue black]
              add 1000 = old + 1st byte of new = [black black blue] to table
0111 0110 new code in table:
              output 0111 = [black black]
              add 1001 = old + 1st byte of new = [blue black black] to table
...

So the output starts (wrapping to 3 columns):

blue  black black
black blue  black
black black ...

which is what you wanted.
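
The whole walk-through can be reproduced with a short decoder. The following is a sketch in Python rather than a reference implementation: gif_lzw_decode is a name invented here, it expects the data sub-blocks already concatenated (without their length bytes), and it does little validation of malformed input.

```python
def gif_lzw_decode(data: bytes, min_code_size: int) -> list[int]:
    """Decode GIF-style LZW data into a flat list of color indices."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1

    def fresh_table():
        # One single-pixel entry per color index, plus two placeholders so
        # that list positions line up with the clear and end codes.
        return [[i] for i in range(clear_code)] + [None, None]

    table = fresh_table()
    width = min_code_size + 1
    prev = None                      # string output by the previous code
    out = []
    bit_buf = bit_count = pos = 0
    while True:
        while bit_count < width:     # refill the bit buffer, LSB first
            if pos >= len(data):
                return out           # ran out of data without an end code
            bit_buf |= data[pos] << bit_count
            bit_count += 8
            pos += 1
        code = bit_buf & ((1 << width) - 1)
        bit_buf >>= width
        bit_count -= width

        if code == clear_code:
            table, width, prev = fresh_table(), min_code_size + 1, None
            continue
        if code == end_code:
            return out
        if prev is None:             # first code after a clear
            if code >= clear_code:
                raise ValueError(f"invalid first LZW code {code}")
            entry = table[code]
        elif code < len(table):      # code already in the table
            entry = table[code]
            if len(table) < 4096:
                table.append(prev + [entry[0]])
        elif code == len(table):     # code not in the table yet
            entry = prev + [prev[0]]
            table.append(entry)
        else:
            raise ValueError(f"invalid LZW code {code}")
        out.extend(entry)
        prev = entry
        if len(table) == (1 << width) and width < 12:
            width += 1               # next free code needs one more bit


palette = ["blue", "white", "black", "black"]
pixels = gif_lzw_decode(bytes([0x84, 0x6E, 0x27, 0xC1, 0x5D]), 2)
for row in range(0, len(pixels), 3):             # image is 3 pixels wide
    print(" ".join(palette[i] for i in pixels[row:row + 3]))
```

Run on the five data bytes, it yields 15 indices: the three rows above followed by two rows of white.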

Other tips

This site is an excellent resource about the GIF format, and offers a great explanation of the LZW compression and decompression process:

http://www.matthewflickinger.com/lab/whatsinagif/index.html

Solution without writing your own GIF reader

For uses other than your own edification, try this.

A few notes

  • Your GIF file is GIF89a. You linked to the GIF87a specification; the 89a specification is here.
  • You seem to be concerned that using a library to parse the image will hurt performance. This makes no sense. The libraries are generally implemented in optimized C; your hand-rolled solution would be written in PHP, an interpreted language.
  • You mentioned PCX, which libraries like ImageMagick do support.

Or just use PNG

According to the ZPL 2 programming manual, PNG is supported. For example, the ~DY (Download Graphics) command takes a b (format) parameter, for which P (PNG) is an option, besides the default GRF. See also Printing PNG images to a zebra network printer.

There are lots of libraries for converting GIF to PNG. You could use ImageMagick (PHP binding), or just use the PHP functions imagecreatefromgif and imagepng.

I can't help you with the LZW decoding, but wouldn't it be easier to just use a library function like imagecreatefromgif() from the PHP GD extension to parse the GIF file and extract the image data, which you can then convert to your target format?

It is good that you want to learn how LZW works without relying on libraries written by someone else. LZW does not decode images pixel by pixel. It looks for repeated blocks in the data stream, saves them in a dictionary and refers back to them. Once a run of 100 pixels is in the dictionary, a single code reproduces all 100 of them, whereas an uncompressed bitmap (BMP) stores 100 separate values. This is why GIF is great for diagrams, where you might have many runs of 100 white pixels followed by a few black ones that draw a line. On the other hand, it is lousy for photographs, because they contain very few long repeats, and GIF is generally restricted to 256 colours unless you use some complicated tricks.

The codes used in the compressed file are longer than the colour codes for each pixel in the original image. It is only because long repeat blocks are common in diagrams that massive compression is possible.
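
To get a feel for how much a long run shrinks, here is a small sketch of the compression side in Python. The name lzw_codes is invented for illustration. It leaves out the clear and end codes and the bit packing, so it only counts how many codes an encoder of this kind would emit.

```python
def lzw_codes(pixels: list[int], min_code_size: int) -> list[int]:
    """Return the LZW codes for a list of color indices (no bit packing)."""
    clear_code = 1 << min_code_size
    table = {(i,): i for i in range(clear_code)}   # single-pixel strings
    next_code = clear_code + 2                     # skip clear and end codes
    codes = []
    current = ()
    for p in pixels:
        candidate = current + (p,)
        if candidate in table:
            current = candidate                    # keep extending the match
        else:
            codes.append(table[current])           # emit the longest match
            if next_code < 4096:                   # GIF caps codes at 12 bits
                table[candidate] = next_code
                next_code += 1
            current = (p,)
    if current:
        codes.append(table[current])
    return codes


run = [1] * 100                                    # 100 identical pixels
print(len(lzw_codes(run, 2)))                      # prints 14
```

The first time a run appears, the dictionary has to be built up, so the 100 pixels cost 14 codes rather than one. A later repeat of the same run can reuse the longer entries and costs fewer codes still.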

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow