Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Gzip compression and decompression in C#

Tags:

c#

gzip

I'm trying to compress an string in one module and decompressing it in another module. Here is the code I'm using.

Compress

public static string CompressString(string text)
{
    byte[] buffer = Encoding.ASCII.GetBytes(text);
    MemoryStream ms = new MemoryStream();
    using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
    {
         zip.Write(buffer, 0, buffer.Length);
    }

    ms.Position = 0;
    MemoryStream outStream = new MemoryStream();

    byte[] compressed = new byte[ms.Length];
    ms.Read(compressed, 0, compressed.Length);

    byte[] gzBuffer = new byte[compressed.Length + 4];
    System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
    System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
    return Convert.ToBase64String(gzBuffer);
}

Decompress

public static byte[] DecompressString(byte[] data)
{
   using (var compressedStream = new MemoryStream(data))
   using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
     using (var resultStream = new MemoryStream())
     {
        zipStream.CopyTo(resultStream);
        return resultStream.ToArray();
     }
}

Using it as:

 DecompressString(System.Text.Encoding.ASCII.GetBytes(ip));

But, for above statement, I'm getting following error.

{"The magic number in GZip header is not correct. Make sure you are passing in a GZip stream."} System.SystemException {System.IO.InvalidDataException}

like image 280
benjamin54 Avatar asked Aug 05 '14 09:08

benjamin54


People also ask

What is gzip decompression?

gzip is a file format and a software application used for file compression and decompression. The program was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compress program used in early Unix systems, and intended for use by GNU (from where the "g" of gzip is derived).

Which is better compress or gzip?

TL;DR: gzip is better than compress . compress is slower than gzip -1 when compressing, it compresses only half as well, but. it is 29% faster when decompressing.

What is gzip and DEFLATE encoding?

gzip: It is a compression format using the Lempel-Ziv coding (LZ77), with a 32-bit CRC. compress: It is a compression format using the Lempel-Ziv-Welch (LZW) algorithm. deflate: It is a compression format using the zlib structure, with the deflate compression algorithm.

What is the difference between GZ and gzip?

The only difference is the name itself. GZIP is the file format, and GZ is the file extension used for GZIP compressed files.


1 Answers

Here is a rewrite of your code that should work the way you want it to.

I wrote it in LINQPad and it can be tested in that.

Note that there's very little error checking here. You should add checks to see if all read operations complete and has actually read what they were supposed to and similar checks.

The output

original: 256
This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test.  

compressed: 56
AAEAAB+LCAAAAAAABAALycgsVgCiRIWS1OISPYWQEcYHANU9d5YAAQAA 

decompressed: 256
This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test. This is a test.  

The program

void Main()
{
    var input = "This is a test. This is a test. ";
    input += input;
    input += input;
    input += input;
    string compressed = Compress(input);
    string decompressed = Decompress(compressed);

    input.Dump("original: " + input.Length);
    compressed.Dump("compressed: " + compressed.Length);
    decompressed.Dump("decompressed: " + decompressed.Length);
}

public static string Decompress(string input)
{
    byte[] compressed = Convert.FromBase64String(input);
    byte[] decompressed = Decompress(compressed);
    return Encoding.UTF8.GetString(decompressed);
}

public static string Compress(string input)
{
    byte[] encoded = Encoding.UTF8.GetBytes(input);
    byte[] compressed = Compress(encoded);
    return Convert.ToBase64String(compressed);
}

public static byte[] Decompress(byte[] input)
{
    using (var source = new MemoryStream(input))
    {
        byte[] lengthBytes = new byte[4];
        source.Read(lengthBytes, 0, 4);

        var length = BitConverter.ToInt32(lengthBytes, 0);
        using (var decompressionStream = new GZipStream(source,
            CompressionMode.Decompress))
        {
            var result = new byte[length];
            decompressionStream.Read(result, 0, length);
            return result;
        }
    }
}

public static byte[] Compress(byte[] input)
{
    using (var result = new MemoryStream())
    {
        var lengthBytes = BitConverter.GetBytes(input.Length);
        result.Write(lengthBytes, 0, 4);

        using (var compressionStream = new GZipStream(result,
            CompressionMode.Compress))
        {
            compressionStream.Write(input, 0, input.Length);
            compressionStream.Flush();

        }
        return result.ToArray();
    }
}
like image 109
Lasse V. Karlsen Avatar answered Sep 23 '22 21:09

Lasse V. Karlsen