Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way to encode bit-vectors?

Currently using the run length encoding for encoding bit-vectors, and the current run time is 2log(i), where is the size of the run. Is there another way of doing it to bring it down to log(i)? Thanks.

like image 733
Chocolate Drop Avatar asked Oct 13 '12 22:10

Chocolate Drop


1 Answers

The most efficient way of encoding a bit vector is to isolate any specific properties of the bit source. If it is totally random, there is no real noticeable gain (actually, a totally random stream of bit cannot be compressed in any way).

If you can find properties in your bit stream you could try to define a collection of vectors which will define the base of a Vector Space. In such case, the result will be very efficient.

We'll need a few more details on your bit stream.


(Edit)

Just a few more details to understand the previous statement: "a totally random stream of bits cannot be compressed in any way"

It is not possible to compress a totally random vector of bits if by "compress" we mean the "transformed/compressed stream" plus the "vector base definition" plus the decompression program. But in most cases the decompression program (and often the vector base too) is embedded in client software. Thus, only the "compressed stream" is needed.

A good explanation (and funny story) about that is Patrick Craig 5000$ compression challenge

More scientific the theory of information, especially entropy section

And, the final one, the full story.

But whatever the solution is, if you have an unknown number of unknown streams to compress you won't be ale to do anything. You have to find a pattern.

like image 56
Askolein Avatar answered Nov 08 '22 21:11

Askolein