
How to pack arbitrary bit sequence in Python?

I want to encode/compress some binary image data as a sequence of bits. (In general, the length of this sequence will not fit neatly into a whole number of standard integer types.)

How can I do this without wasting space? (I realize that, unless the sequence of bits has a "nice" length, there will always have to be a small amount [< 1 byte] of leftover space at the very end.)

FWIW, I estimate that, at most, 3 bits will be needed per symbol that I want to encode. Does Python have any built-in tools for this kind of work?

kjo asked Feb 21 '11 12:02


2 Answers

There's nothing very convenient built in, but there are third-party modules such as bitstring and bitarray which are designed for this. For example, with bitstring:

from bitstring import BitArray
s = BitArray('0b11011')
s += '0b100'
s += 'uint:5=9'
s += [0, 1, 1, 0, 1]
...
s.tobytes()

To join together a sequence of 3-bit numbers (i.e. values in the range 0 to 7) you could use:

>>> symbols = [0, 4, 5, 3, 1, 1, 7, 6, 5, 2, 6, 2]
>>> BitArray().join(BitArray(uint=x, length=3) for x in symbols)
BitArray('0x12b27eab2')
>>> _.tobytes()
b'\x12\xb2~\xab '
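If adding a third-party dependency is not an option, the same packing can be approximated with plain Python integers. The following is only a minimal sketch, not part of bitstring: the helper names pack_symbols and unpack_symbols are hypothetical, and it assumes fixed-width symbols packed most significant bit first, with zero bits padding the final byte.

```python
def pack_symbols(symbols, width=3):
    """Pack small non-negative ints into bytes, `width` bits each, MSB first."""
    acc = 0
    for sym in symbols:
        if not 0 <= sym < (1 << width):
            raise ValueError(f"symbol {sym} does not fit in {width} bits")
        acc = (acc << width) | sym
    nbits = width * len(symbols)
    pad = -nbits % 8  # zero bits needed to reach a byte boundary
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_symbols(data, count, width=3):
    """Inverse of pack_symbols; `count` has to be stored separately."""
    acc = int.from_bytes(data, "big")
    total = len(data) * 8
    mask = (1 << width) - 1
    return [(acc >> (total - width * (i + 1))) & mask for i in range(count)]


symbols = [0, 4, 5, 3, 1, 1, 7, 6, 5, 2, 6, 2]
packed = pack_symbols(symbols)
print(packed.hex())  # 12b27eab20
print(unpack_symbols(packed, len(symbols)) == symbols)  # True
```

Because of the padding bits, the packed bytes alone do not say how many symbols they hold, so the count (or total bit length) needs to be kept alongside the data.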

Some related questions:

  • What is the best way to do Bit Field manipulation in Python?
  • Python Bitstream implementations
Scott Griffiths answered Sep 29 '22 06:09


Have you tried simply compressing the whole sequence with bz2? If the sequence is long, use bz2.BZ2Compressor so that it can be processed in chunks; otherwise call bz2.compress on the whole thing. The compression will probably not be ideal, but it typically gets very close when dealing with sparse data.
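A rough sketch of both variants, using only the standard-library bz2 module. The helper name compress_chunks, the 1024-byte chunk size, and the all-zero sample input are assumptions made for illustration, not anything taken from the question.

```python
import bz2


def compress_chunks(chunks):
    """Compress an iterable of bytes objects incrementally."""
    comp = bz2.BZ2Compressor()
    out = []
    for chunk in chunks:
        out.append(comp.compress(chunk))
    out.append(comp.flush())  # emit whatever the compressor still buffers
    return b"".join(out)


data = bytes(10_000)  # hypothetical sparse input: all zero bytes

# One-shot compression of the whole sequence.
one_shot = bz2.compress(data)

# Chunked compression, feeding 1024 bytes at a time.
chunked = compress_chunks(data[i:i + 1024] for i in range(0, len(data), 1024))

print(len(data), len(one_shot), len(chunked))
```

Either result can be read back with bz2.decompress.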

Hope that helps.

Vukasin Toroman answered Sep 29 '22 07:09