I need to compress an array of bytes, so I wrote this snippet:
    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Text;

    class Program
    {
        static void Main()
        {
            var test = "foo bar baz";
            var compressed = Compress(Encoding.UTF8.GetBytes(test));
            var decompressed = Decompress(compressed);
            Console.WriteLine("size of initial table = " + test.Length);
            Console.WriteLine("size of compressed table = " + compressed.Length);
            Console.WriteLine("size of decompressed table = " + decompressed.Length);
            Console.WriteLine(Encoding.UTF8.GetString(decompressed));
            Console.ReadKey();
        }

        // Gzip-compresses the given bytes in memory.
        static byte[] Compress(byte[] data)
        {
            using (var compressedStream = new MemoryStream())
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
                // Close first so the remaining compressed bytes and the gzip trailer are flushed.
                zipStream.Close();
                return compressedStream.ToArray();
            }
        }

        // Reverses Compress: inflates gzip data back to the original bytes.
        static byte[] Decompress(byte[] data)
        {
            using (var compressedStream = new MemoryStream(data))
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
            using (var resultStream = new MemoryStream())
            {
                zipStream.CopyTo(resultStream);
                return resultStream.ToArray();
            }
        }
    }
The problem is the output: I don't understand why the size of the compressed array is greater than that of the decompressed one!
Any ideas?
Edit
After @spender's comment: if I change the test string, for example:
    var test = "foo bar baz very long string for example hdgfgfhfghfghfghfghfghfghfghfghfghfghfhg";
I get a different result. So what is the minimum size of the initial array for it to be worth compressing?
gzip is a file format and a software application used for file compression and decompression. The program was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compress program used in early Unix systems, and intended for use by GNU (from where the "g" of gzip is derived).
GZIP can reduce the amount of data by up to 70%. However, tests comparing compressed file sizes across different compression algorithms have shown that alternatives such as Brotli outperform GZIP for text-based assets.
TL;DR: gzip is better than compress. compress is slower than gzip -1 when compressing and compresses only half as well, but it is 29% faster when decompressing.
However, in practice, GZIP performs best on text-based content, often achieving compression rates of as high as 70-90% for larger files, whereas running GZIP on assets that are already compressed via alternative algorithms (for example, most image formats) yields little to no improvement.
This is because the amount of data is so small that the overheads of the compression format outweigh the gain of compression.
Try more data.
If you compressed entirely random data (or already compressed data such as JPEG), you would never make any significant gain. However, the string new String('*', 1000000) would compress down really nicely.
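To see the effect for yourself, here is a minimal sketch that compresses three inputs and prints the sizes before and after. The class name `CompressibilityDemo`, the helper `CompressedSize`, and the fixed random seed are my own choices for illustration, not anything from the question.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class CompressibilityDemo
{
    // Hypothetical helper: returns the gzip-compressed length of data.
    static int CompressedSize(byte[] data)
    {
        using (var compressedStream = new MemoryStream())
        {
            // Dispose the GZipStream before reading the result so everything is flushed.
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return compressedStream.ToArray().Length;
        }
    }

    static void Report(string label, byte[] data)
    {
        Console.WriteLine(label + ": " + data.Length + " bytes -> " + CompressedSize(data) + " bytes");
    }

    static void Main()
    {
        // Tiny input: format overhead dominates.
        Report("tiny", Encoding.UTF8.GetBytes("foo bar baz"));

        // Large, highly repetitive input.
        Report("repetitive", Encoding.UTF8.GetBytes(new string('*', 1000000)));

        // Large pseudo-random input (seed chosen arbitrarily for repeatability).
        var random = new byte[1000000];
        new Random(42).NextBytes(random);
        Report("random", random);
    }
}
```

I would expect the tiny input to grow, the repetitive one to shrink to a small fraction of its size, and the random one to stay roughly the same size or grow slightly. The exact numbers depend on the runtime and compression level, so I have not listed any.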
GZIP adds at least 18 bytes of overhead (a 10-byte header plus an 8-byte trailer), so an input smaller than that, or only marginally larger, will not benefit even if it is easily compressible.
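If you want to look at that framing directly, the sketch below compresses the question's string and inspects the first and last bytes of the result. It follows the gzip layout in RFC 1952 (magic bytes, method byte, and a trailer ending in the uncompressed length). The subtraction of 18 assumes the header carries no optional fields such as a file name, which is my understanding of what GZipStream writes; treat that part as an assumption. The class name `GzipFraming` is made up for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class GzipFraming
{
    static void Main()
    {
        var input = Encoding.UTF8.GetBytes("foo bar baz");

        byte[] gz;
        using (var ms = new MemoryStream())
        {
            using (var zip = new GZipStream(ms, CompressionMode.Compress))
            {
                zip.Write(input, 0, input.Length);
            }
            gz = ms.ToArray();
        }

        // RFC 1952: a gzip member starts with 0x1F 0x8B, then the method byte (8 = DEFLATE).
        Console.WriteLine("magic: {0:X2} {1:X2}, method: {2}", gz[0], gz[1], gz[2]);

        // The last 4 bytes (ISIZE) hold the uncompressed length, little-endian.
        int n = gz.Length;
        uint isize = (uint)gz[n - 4]
                   | (uint)gz[n - 3] << 8
                   | (uint)gz[n - 2] << 16
                   | (uint)gz[n - 1] << 24;
        Console.WriteLine("uncompressed length stored in trailer: " + isize);

        // Assumption: 10-byte header with no optional fields, plus 8-byte trailer (CRC32 + ISIZE).
        Console.WriteLine("total: " + n + " bytes, of which DEFLATE payload: " + (n - 18));
    }
}
```

Whatever is left after subtracting the fixed 18 bytes is the DEFLATE stream itself, which also carries a few bits of block overhead, so very short inputs cannot come out smaller than they went in.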
Here's an interesting question that probes further into GZIP: What's the most that GZIP or DEFLATE can increase a file size?