
How can we distinguish deflate stream from deflateRaw stream?

Tags: http, node.js, zlib

Some HTTP servers send a raw deflate body (without the zlib header) instead of a proper zlib-wrapped deflate body. See discussion at: Why do real-world servers prefer gzip over deflate encoding?

Is it possible to detect them and inflate the body properly in Node.js? I mean other than trying createInflate first, catching the error, and then trying again with createInflateRaw.

bitinn asked May 30 '16 07:05




1 Answer

If the low nybble of the first byte is 8 (the byte, written in hex, ends in 8), then it is a zlib stream. Otherwise it is a raw deflate stream. This assumes you know a priori that the only possible choices are a valid zlib stream or a valid raw deflate stream. A raw deflate stream will never have an 8 in the low nybble of its first byte, but a zlib stream always will.
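The rule above can be sketched in Node.js roughly as follows. This is a minimal illustration, not a definitive implementation: `looksLikeZlib` and `inflateAuto` are hypothetical helper names, and the sketch assumes the whole body is already buffered and is either a valid zlib stream or a valid raw deflate stream.

```javascript
const zlib = require('zlib');

// Hypothetical helper: true when the low nybble of the first byte is 8,
// which the rule above treats as the mark of a zlib stream.
function looksLikeZlib(buf) {
  return buf.length > 0 && (buf[0] & 0x0f) === 8;
}

// Hypothetical helper: pick the matching decoder for a fully buffered body.
// Assumes the input is either a valid zlib stream or a valid raw deflate stream.
function inflateAuto(buf) {
  return looksLikeZlib(buf) ? zlib.inflateSync(buf) : zlib.inflateRawSync(buf);
}
```

For a streaming response, the same check could be applied to the first byte of the first chunk before choosing between zlib.createInflate() and zlib.createInflateRaw().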

Background:

The zlib header format puts the compression method in the low nybble of the first byte. That compression method is always 8 for deflate.
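If a stricter test is wanted, the rest of the two-byte zlib header can be checked as well. The sketch below goes beyond what the answer states and relies on the zlib format specification (RFC 1950): the high nybble of the first byte (the window size field) is at most 7 when the method is 8, and the first two bytes, read as a big-endian 16-bit number, are a multiple of 31. The function name `isZlibHeader` is hypothetical.

```javascript
// Hypothetical helper: stricter zlib header check based on RFC 1950.
function isZlibHeader(buf) {
  if (buf.length < 2) return false;
  const cmf = buf[0];          // compression method and flags byte
  const flg = buf[1];          // flags byte, includes the check bits
  const method = cmf & 0x0f;   // low nybble: compression method
  const cinfo = cmf >> 4;      // high nybble: window size field
  return method === 8 && cinfo <= 7 && ((cmf << 8) | flg) % 31 === 0;
}
```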

The bit sequence in a raw deflate stream starts from the least significant bits of the bytes. If the first three bits are 000 (as they are for an 8), that signifies a stored (not compressed) block, and it is not the last block. Stored blocks put the bytes of the input on byte boundaries. So the next thing that is done by the compressor after writing the 000 bits is to fill out the rest of the byte with zero bits to get to the next byte boundary. Therefore the next bit will never be a 1, so it is not possible for a valid deflate stream to have the first four bits be 1000, or the first nybble to be 8. (Note that the bits are read from the bottom up.)

The first (i.e. low) nybble of a valid deflate stream can only be 0..5 or a..d. If you see 6..9, e, or f, then it is not a valid deflate stream.
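The list of allowed nybbles follows from decoding the first four bits as described above: bit 0 is the last-block flag, bits 1 and 2 are the block type, and bit 3 is the first bit after the block header. A rough sketch of that reasoning, with the hypothetical name `canStartRawDeflate`:

```javascript
// Hypothetical helper: can this byte be the first byte of a valid raw
// deflate stream, following the reasoning in the answer?
function canStartRawDeflate(firstByte) {
  const nybble = firstByte & 0x0f;
  const blockType = (nybble >> 1) & 3;  // bits 1-2: 0 stored, 1 fixed, 2 dynamic, 3 reserved
  const nextBit = (nybble >> 3) & 1;    // bit 3: first bit after the 3-bit block header

  if (blockType === 3) return false;    // reserved block type: rules out 6, 7, e, f
  // Per the answer, a compressor pads a stored block header with zero bits,
  // so a 1 here rules out 8 and 9.
  if (blockType === 0 && nextBit === 1) return false;
  return true;
}
```

Calling it for every value from 0 to 15 leaves 0..5 and a..d as the only nybbles that pass.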

Mark Adler answered Oct 18 '22 17:10