
Using Gzip to compress/decompress an array of bytes

I need to compress an array of bytes, so I wrote this snippet:

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Text;

    class Program
    {
        static void Main()
        {
            var test = "foo bar baz";
            var input = Encoding.UTF8.GetBytes(test);

            var compressed = Compress(input);
            var decompressed = Decompress(compressed);

            // Report byte counts rather than the character count of the string.
            Console.WriteLine("size of initial table = " + input.Length);
            Console.WriteLine("size of compressed table = " + compressed.Length);
            Console.WriteLine("size of decompressed table = " + decompressed.Length);
            Console.WriteLine(Encoding.UTF8.GetString(decompressed));
            Console.ReadKey();
        }

        static byte[] Compress(byte[] data)
        {
            using (var compressedStream = new MemoryStream())
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
                // Close the GZipStream first so the final block and the gzip
                // trailer are written before the buffer is read.
                zipStream.Close();
                return compressedStream.ToArray();
            }
        }

        static byte[] Decompress(byte[] data)
        {
            using (var compressedStream = new MemoryStream(data))
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
            using (var resultStream = new MemoryStream())
            {
                // Stream the decompressed bytes into an in-memory buffer.
                zipStream.CopyTo(resultStream);
                return resultStream.ToArray();
            }
        }
    }

The problem is that I get this output:

[image: output]

I don't understand why the compressed array is larger than the decompressed one!

Any ideas?

Edit

After @spender's comment: if I change the test string, for example:

    var test = "foo bar baz very long string for example hdgfgfhfghfghfghfghfghfghfghfghfghfghfhg";

I get a different result. So what is the minimum size of the initial array for compression to be worthwhile?

asked Dec 01 '16 11:12 by Lamloumi Afif



1 Answer

This is because the input is so small that the overhead of the compression format outweighs the gain from compression.

Try more data.

If you compressed entirely random data (or already compressed data such as JPEG), you would never make any significant gain. However, the string new string('*', 1000000) would compress down really nicely.
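To see the difference in practice, here is a minimal sketch that compresses one million '*' characters and one million pseudo-random bytes with the same approach as the Compress method in the question. The class name CompressibilityDemo and the seed 42 are arbitrary choices for illustration, and the exact sizes printed will depend on the runtime.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class CompressibilityDemo
{
    // Same idea as Compress in the question: gzip a byte array in memory.
    public static byte[] Compress(byte[] data)
    {
        using (var compressedStream = new MemoryStream())
        {
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
            } // disposing the GZipStream writes the final block and the trailer

            // ToArray still works after the MemoryStream has been closed.
            return compressedStream.ToArray();
        }
    }

    static void Main()
    {
        // Highly repetitive input: one million '*' characters, one byte each.
        byte[] repetitive = Encoding.ASCII.GetBytes(new string('*', 1000000));

        // Pseudo-random input of the same size (the seed is arbitrary).
        byte[] random = new byte[1000000];
        new Random(42).NextBytes(random);

        Console.WriteLine("repetitive: {0} -> {1} bytes",
            repetitive.Length, Compress(repetitive).Length);
        Console.WriteLine("random:     {0} -> {1} bytes",
            random.Length, Compress(random).Length);
    }
}
```

The repetitive input should shrink to a tiny fraction of its size, while the random input should come out at roughly its original size or slightly larger.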

GZIP adds at least 18 bytes (a 10-byte header and an 8-byte trailer), so input that is smaller than this, or only marginally larger, will not benefit even if it is easily compressible.
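The fixed overhead can be inspected directly. The following sketch compresses the question's "foo bar baz" string, prints the two magic bytes that start every gzip stream, and reads the original size back from the last four bytes of the trailer. GzipOverheadDemo and ReadOriginalSize are hypothetical names introduced here, and the figure of 18 bytes assumes the header carries no optional fields such as a file name.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class GzipOverheadDemo
{
    // Gzip a byte array in memory, as in the question.
    public static byte[] Compress(byte[] data)
    {
        using (var compressedStream = new MemoryStream())
        {
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
            } // disposing the GZipStream writes the trailer
            return compressedStream.ToArray();
        }
    }

    // The gzip trailer ends with ISIZE: the uncompressed length
    // (modulo 2^32) stored as a little-endian 32-bit integer.
    public static uint ReadOriginalSize(byte[] gzip)
    {
        int n = gzip.Length;
        return (uint)(gzip[n - 4]
            | (gzip[n - 3] << 8)
            | (gzip[n - 2] << 16)
            | (gzip[n - 1] << 24));
    }

    static void Main()
    {
        byte[] input = Encoding.UTF8.GetBytes("foo bar baz");
        byte[] gzip = Compress(input);

        // Every gzip stream starts with the magic bytes 1F 8B.
        Console.WriteLine("magic bytes: {0:X2} {1:X2}", gzip[0], gzip[1]);
        Console.WriteLine("input size: {0}", input.Length);
        Console.WriteLine("gzip size: {0}", gzip.Length);
        Console.WriteLine("size recorded in trailer: {0}", ReadOriginalSize(gzip));

        // Whatever remains after the assumed 18 bytes of header and
        // trailer is the DEFLATE-compressed payload itself.
        Console.WriteLine("bytes of DEFLATE data: {0}", gzip.Length - 18);
    }
}
```

For an 11-byte input, the header and trailer alone are already larger than the data, which is why the compressed array in the question ends up bigger than the original.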

Here's an interesting question that probes further into GZIP: What's the most that GZIP or DEFLATE can increase a file size?

answered Oct 03 '22 23:10 by spender