Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is best compression algorithm for integers?

I want the best compression algorithm for a list of random numbers.

List example:

224.19
225.57
226.09
222.74
222.20
222.11
223.14
540.56
538.96
540.14
540.44
336.45
338.47
340.78
156.73
160.02
158.56
156.23
55.08
56.33
54.88
53.45

I can skip the fractional part. I have huge list of numbers just like the example given above that's why it needs to be compressed.

Can you recommend something?

like image 713
Waqas Avatar asked Feb 16 '23 04:02

Waqas


1 Answers

Don't use floats, use integers with some sort of control character to represent the decimal point if you need it, but if you can skip it, all the better.

Take a look at Variable byte encodings. It's advantage is that you don't need to allocate 64 bits of memory for small integers.

If your numbers have some dependency between each other you could look into Delta encoding - it stores the difference between two numbers rather than the numbers itself.

Variable byte encoding and delta encoding are used as the core methods of compressing indexes of inverted lists by Google and any other company dealing with search engines.

like image 113
XapaJIaMnu Avatar answered Feb 18 '23 16:02

XapaJIaMnu