Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Trying to understand zlib/deflate in PNG files

I'm currently in the middle of writing a small PNG image I/O library myself for learning purposes. My problem is the following:

I created a small PNG only 2 by 2 pixels in dimension and opened it in a hex editor to study its contents. This is the image I created using GIMP and stored with a compression of "9".

(Please note that this is an enlarged image of the original 2 by 2 pixel image ;) )

a black, red, blue and green pixel in a two by two array of pixels.

So I guess uncompressed, this would look something like this in memory:

00 00 00 FF 00 00 00 00 FF 00 FF 00

when stored without an alpha channel.

(I only stated this here for clarity. I know of compression and did not expect to see this byte pattern in the file).

I extracted the IDAT chunk and stripped the chunk ID ("IDAT") and the tailing CRC value and got this sequence of bytes:

08 D7 05 C1 01 01 00 00 00 80 10 FF 4F 17 10 48 06 0F FE 02 FE

Now the first two bytes 08 D7 contain information on the encoded block. And the last four bytes 0F FE 02 FE must be the ADLER32 checksum.

This ultimately leaves me with the following bytes:

05 C1 01 01 00 00 00 80 10 FF 4F 17 10 48 06

Written in binary representation these bytes are:

0000 0101  1100 0001  0000 0001  0000 0001
0000 0000  0000 0000  0000 0000  1000 0000
0001 0000  1111 1111  0100 1111  0001 0111
0001 0000  0100 1000  0000 0110

To understand DEFLATE better I tried to "unpack" this sequence by hand at least before I understand it well enough to write a small tool. But I got stuck really quickly.

RFC 1951 ("DEFLATE Compressed Data Format Specification") states that every encoded block begins with a three bit header. One bit indicating whether or not this is the last block and two more blocks indicating the compression method. Since I assume the encoder used only one block here (meaning the first block automatically is the last) and used a non static Huffmann tree, I am looking for the bit sequence "101" but I cannot find it (nor do I find other likely headers "100" or "110").

Also the RFC says that there must be two two byte values LEN and NLEN storing the length of the block where NLEN is the one's complement of LEN but again I am unable to find four such bytes that satisfy this condition. I won't even start on my luck of finding anything that could represend the two Huffmann trees.

I read RFCs 1951 and 1950 ("ZLIB Compressed Data Format Specification" as well as the Wikipedia articles on zlib, DEFLATE, LZ77 and Huffman coding, as well as several small and rather unhelpful articles on the web and a few answers on Stack Overflow, but none could help me with my lack of understanding.

I would be really grateful for any help or hint!

like image 422
Rafael Pasquay Avatar asked Sep 24 '14 13:09

Rafael Pasquay


2 Answers

In case this helps, here is a disassembly of the IDAT chunk contents:

! infgen 2.2 output
!
zlib
!
last
dynamic
count 257 2 18
code 1 1
code 2 2
code 18 2
lens 1
zeros 138
zeros 116
lens 2 2 1 1
! litlen 0 1
! litlen 255 2
! litlen 256 2
! dist 0 1
! dist 1 1
literal 0 0 0 0 255 0 0 0 0 0 255 0 255 0
end
!
adler

You can get the infgen source code here.

like image 170
Mark Adler Avatar answered Sep 30 '22 17:09

Mark Adler


I think you are missing how bits are packed inside bytes (see eg section 3.1.1 of RFC)

 Data elements are packed into bytes in order of
 increasing bit number within the byte, i.e., starting
 with the least-significant bit of the byte.

Hence, if the first byte is 05 = 0000 0101 the first bit is 1.

(BTW, it's surely quite instructive to look things at so much detail, but I wonder if you are not going a little too far if your intent is to understand PNG. )

Further, when you come to the point were you find the uncompressed IDAT stream, bear in mind that the pixels are coded with one of five filters per row, and that there is an extra byte at the start of each row that signals the filter type. So, you won't really find the raw 12 bytes 00 00 00 FF 00 00 00 00 FF 00 FF 00, but 12+2=14 bytes instead.

like image 41
leonbloy Avatar answered Sep 30 '22 17:09

leonbloy