
Is there a name for this compression algorithm?

Tags: compression

Say you have a four-byte integer and you want to compress it to fewer bytes. You are able to compress it because smaller values are more probable than larger values (i.e., the probability of a value decreases with its magnitude). You apply the following scheme, which produces a result of one to five bytes:

Note that in the description below, the bits are one-based and go from most significant to least significant, i.e., the first bit refers to the most significant bit, the second bit to the next most significant bit, etc.

  1. If n<128, you encode it as a single byte with the first bit set to zero. The remaining 7 bits encode n.
  2. If n>=128 and n<16,384, you use a two-byte integer. You set the first bit to one and the second bit to zero. Then you use the remaining 14 bits to encode n.
  3. If n>=16,384 and n<2,097,152, you use a three-byte integer. You set the first bit to one, the second bit to one, and the third bit to zero. You use the remaining 21 bits to encode n.
  4. If n>=2,097,152 and n<268,435,456, you use a four-byte integer. You set the first three bits to one and the fourth bit to zero. You use the remaining 28 bits to encode n.
  5. If n>=268,435,456 and n<4,294,967,296, you use five bytes. You set the first four bits to one and use the following 32 bits to store the exact value of n as a four-byte integer. The remaining four bits are unused.
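
The five cases above can be sketched in Python as follows. This is only an illustration, not a reference implementation: the function names `encode` and `decode` are made up here, each range is taken to include its lower bound (so 16,384 is the first three-byte value), and in the five-byte case the sketch assumes the unused bits are the low four bits of the first byte, with n stored big-endian in the next four bytes.

```python
def encode(n: int) -> bytes:
    """Encode a 32-bit unsigned integer using the prefix scheme described above."""
    if not 0 <= n < 1 << 32:
        raise ValueError("n must fit in an unsigned 32-bit integer")
    if n < 1 << 7:
        return n.to_bytes(1, "big")                  # 0xxxxxxx
    if n < 1 << 14:
        return (0x8000 | n).to_bytes(2, "big")       # 10xxxxxx + 1 byte
    if n < 1 << 21:
        return (0xC00000 | n).to_bytes(3, "big")     # 110xxxxx + 2 bytes
    if n < 1 << 28:
        return (0xE0000000 | n).to_bytes(4, "big")   # 1110xxxx + 3 bytes
    # Assumption: the prefix byte is 1111 followed by four unused zero bits,
    # and n follows as a plain four-byte big-endian integer.
    return b"\xF0" + n.to_bytes(4, "big")


def decode(data: bytes) -> int:
    """Decode one value from the start of data (inverse of encode)."""
    first = data[0]
    if first < 0x80:                                 # prefix 0
        return first
    if first < 0xC0:                                 # prefix 10
        return int.from_bytes(data[:2], "big") & 0x3FFF
    if first < 0xE0:                                 # prefix 110
        return int.from_bytes(data[:3], "big") & 0x1FFFFF
    if first < 0xF0:                                 # prefix 1110
        return int.from_bytes(data[:4], "big") & 0x0FFFFFFF
    return int.from_bytes(data[1:5], "big")          # prefix 1111
```

Under these assumptions, 127 encodes to the single byte 0x7F and 128 to the two bytes 0x80 0x80. The decoder only needs to inspect the first byte to know how many bytes follow.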

Is there a name for this algorithm?

Michael Goldshteyn asked Apr 11 '11 18:04



1 Answer

This is quite close to variable-length quantity encoding or base-128. The latter name stems from the fact that each 7-bit unit in your encoding can be considered a base-128 digit.
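
For comparison, here is a minimal sketch of a variable-length quantity encoder in the big-endian form used by Standard MIDI Files (the name `vlq_encode` is hypothetical). Unlike the scheme in the question, which signals the total length with a prefix in the first byte, VLQ sets the high bit of every byte except the last, so each byte carries one 7-bit base-128 digit:

```python
def vlq_encode(n: int) -> bytes:
    """Encode a non-negative integer as a big-endian variable-length quantity."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # The least significant 7-bit group comes last and has its high bit clear.
    groups = [n & 0x7F]
    n >>= 7
    while n:
        # Every earlier group has its high bit set to signal "more bytes follow".
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))
```

With this sketch, 128 becomes the bytes 0x81 0x00, whereas the scheme in the question yields 0x80 0x80 for the same value.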

Fred Foo answered Jan 03 '23 00:01
