 

The fastest GZIP decompress library in .NET [closed]

Which .NET library has the fastest decompress performance (in terms of throughput)?

There are quite a few libraries out there...

  • GZipStream
  • DotNetZip
  • Xceed Zip for .NET
  • SevenZipLib
  • SharpZipLib

...and I expect there are more I haven't listed.

Has anyone seen a benchmark of the throughput performance of these GZIP libraries? I'm interested in decompression throughput, but I'd like to see the results for compression too.

asked Jul 20 '10 by Rudiger

People also ask

What is GZIP in C#?

You have probably seen compressed files with the “gz” extension. These are files that hold a single compressed file according to the GZIP specification. In .NET, GZip files are represented by the GZipStream object.
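
For context, here is a minimal sketch (not from the original post) of round-tripping a byte array through System.IO.Compression.GZipStream; the sample string is just a placeholder:

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class GZipRoundTrip
{
    static byte[] Compress(byte[] data)
    {
        var output = new MemoryStream();
        // Closing the GZipStream writes the gzip trailer; MemoryStream.ToArray()
        // still works after the underlying stream has been closed.
        using (var gzip = new GZipStream(output, CompressionMode.Compress))
            gzip.Write(data, 0, data.Length);
        return output.ToArray();
    }

    static byte[] Decompress(byte[] gzipData)
    {
        using (var input = new MemoryStream(gzipData))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    static void Main()
    {
        var original = Encoding.UTF8.GetBytes("hello gzip");
        var restored = Decompress(Compress(original));
        Console.WriteLine(Encoding.UTF8.GetString(restored)); // prints "hello gzip"
    }
}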

Does zlib support GZIP?

Yes. For applications that require data compression, the zlib library provides both compression and decompression, and it can read and write gzip-format streams as well as its own zlib format. The zlib library has its own home page at https://www.zlib.net.

What is the compression ratio of GZIP?

GZIP provides a good-enough compression ratio, typically between 2.5 and 3 for text, and it is fast both to compress and to decompress data.
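
In other words, the ratio is just the original size divided by the compressed size; a tiny sketch with made-up sizes:

using System;

class RatioExample
{
    static void Main()
    {
        // Hypothetical sizes: 30 MB of text compressing down to 10 MB
        long originalBytes = 30L * 1024 * 1024;
        long compressedBytes = 10L * 1024 * 1024;
        double ratio = (double)originalBytes / compressedBytes;
        Console.WriteLine("Compression ratio: " + ratio); // 3, at the top of the 2.5-3 range quoted above
    }
}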


2 Answers

I ran into problems with Microsoft's GZipStream implementation being unable to read certain gzip files, so I have been testing a few libraries.

This is a basic test I adapted for you to run, tweak, and decide:

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using NUnit.Framework;
using Ionic.Zlib;
using ICSharpCode.SharpZipLib.GZip;

namespace ZipTests
{
    [TestFixture]
    public class ZipTests
    {
        MemoryStream input, compressed, decompressed;
        Stream compressor;
        int inputSize;
        Stopwatch timer;

        public ZipTests()
        {
            string testFile = "TestFile.pdf";
            using(var file = File.OpenRead(testFile))
            {
                inputSize = (int)file.Length;
                Console.WriteLine("Reading " + inputSize + " from " + testFile);
                // Read the file straight into the MemoryStream's backing buffer
                var ms = new MemoryStream(inputSize);
                file.Read(ms.GetBuffer(), 0, inputSize);
                ms.Position = 0;
                input = ms;
            }
            compressed = new MemoryStream();
        }

        void StartCompression()
        {
            Console.WriteLine("Using " + compressor.GetType() + ":");
            GC.Collect(2, GCCollectionMode.Forced); // Start fresh
            timer = Stopwatch.StartNew();
        }

        public void EndCompression()
        {
            timer.Stop();
            Console.WriteLine("  took " + timer.Elapsed
                + " to compress " + inputSize.ToString("#,0") + " bytes into "
                + compressed.Length.ToString("#,0"));
            decompressed = new MemoryStream(inputSize);
            compressed.Position = 0; // Rewind!
            timer.Restart();
        }

        public void AfterDecompression()
        {
            timer.Stop();
            Console.WriteLine("  then " + timer.Elapsed + " to decompress.");
            Assert.AreEqual(inputSize, decompressed.Length);
            Assert.AreEqual(input.GetBuffer(), decompressed.GetBuffer());
            input.Dispose();
            compressed.Dispose();
            decompressed.Dispose();
        }

        [Test]
        public void TestGZipStream()
        {
            compressor = new System.IO.Compression.GZipStream(compressed, System.IO.Compression.CompressionMode.Compress, true);
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();

            EndCompression();

            var decompressor = new System.IO.Compression.GZipStream(compressed, System.IO.Compression.CompressionMode.Decompress, true);
            decompressor.CopyTo(decompressed);

            AfterDecompression();
        }

        [Test]
        public void TestDotNetZip()
        {
            compressor = new Ionic.Zlib.GZipStream(compressed, Ionic.Zlib.CompressionMode.Compress, true);
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();

            EndCompression();

            var decompressor = new Ionic.Zlib.GZipStream(compressed,
                                    Ionic.Zlib.CompressionMode.Decompress, true);
            decompressor.CopyTo(decompressed);

            AfterDecompression();
        }

        [Test]
        public void TestSharpZlib()
        {
            compressor = new ICSharpCode.SharpZipLib.GZip.GZipOutputStream(compressed)
            { IsStreamOwner = false };
            StartCompression();
            compressor.Write(input.GetBuffer(), 0, inputSize);
            compressor.Close();

            EndCompression();

            var decompressor = new ICSharpCode.SharpZipLib.GZip.GZipInputStream(compressed);
            decompressor.CopyTo(decompressed);

            AfterDecompression();
        }

        static void Main()
        {
            Console.WriteLine("Running CLR version " + Environment.Version +
                " on " + Environment.OSVersion);
            Assert.AreEqual(1,1); // Preload NUnit
            new ZipTests().TestGZipStream();
            new ZipTests().TestDotNetZip();
            new ZipTests().TestSharpZlib();
        }
    }
}

And the result on the system I am currently running (Mono on Linux) is as follows:

Running Mono CLR version 4.0.30319.1 on Unix 3.2.0.29
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using System.IO.Compression.GZipStream:
  took 00:00:03.3058572 to compress 37,711,561 bytes into 33,438,894
  then 00:00:00.5331546 to decompress.
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using Ionic.Zlib.GZipStream:
  took 00:00:08.9531478 to compress 37,711,561 bytes into 33,437,891
  then 00:00:01.8047543 to decompress.
Reading 37711561 from /home/agustin/Incoming/ZipTests/TestFile.pdf
Using ICSharpCode.SharpZipLib.GZip.GZipOutputStream:
  took 00:00:07.4982231 to compress 37,711,561 bytes into 33,431,962
  then 00:00:02.4157496 to decompress.

Be warned that this is Mono's GZIP implementation; Microsoft's version will give its own results (and, as I mentioned, it can't handle every gzip file you give it).
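
If you do run into one of the gzip files that the framework's GZipStream refuses to read, one possible workaround (a sketch, assuming SharpZipLib is referenced) is to fall back to another decoder when decompression fails:

using System.IO;
using System.IO.Compression;
using ICSharpCode.SharpZipLib.GZip;

static class SafeGunzip
{
    // Tries System.IO.Compression first, then falls back to SharpZipLib.
    public static byte[] Decompress(byte[] gzipData)
    {
        try
        {
            using (var input = new MemoryStream(gzipData))
            using (var gzip = new System.IO.Compression.GZipStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                gzip.CopyTo(output);
                return output.ToArray();
            }
        }
        catch (InvalidDataException)
        {
            using (var input = new MemoryStream(gzipData))
            using (var gzip = new GZipInputStream(input))
            using (var output = new MemoryStream())
            {
                gzip.CopyTo(output);
                return output.ToArray();
            }
        }
    }
}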

This is what I got on a Windows system:

Running CLR version 4.0.30319.1 on Microsoft Windows NT 5.1.2600 Service Pack 3
Reading 37711561 from TestFile.pdf
Using System.IO.Compression.GZipStream:
  took 00:00:03.3557061 to compress 37.711.561 bytes into 36.228.969
  then 00:00:00.7079438 to decompress.
Reading 37711561 from TestFile.pdf
Using Ionic.Zlib.GZipStream:
  took 00:00:23.4180958 to compress 37.711.561 bytes into 33.437.891
  then 00:00:03.5955664 to decompress.
Reading 37711561 from TestFile.pdf
Using ICSharpCode.SharpZipLib.GZip.GZipOutputStream:
  took 00:00:09.9157130 to compress 37.711.561 bytes into 33.431.962
  then 00:00:03.0983499 to decompress.

It is easy enough to add more tests...
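
Since the question is specifically about throughput, one easy extension is to report MB/s instead of just elapsed times; a small helper along these lines (the method name is my own) could be called with the inputSize and timer values from the tests above:

using System;

static class Throughput
{
    // Prints throughput in MB/s for a given byte count and elapsed time,
    // e.g. Throughput.Report("decompress", inputSize, timer.Elapsed).
    public static void Report(string label, long bytes, TimeSpan elapsed)
    {
        double mbPerSecond = bytes / elapsed.TotalSeconds / (1024 * 1024);
        Console.WriteLine("  " + label + ": " + mbPerSecond.ToString("0.0") + " MB/s");
    }
}

For example, the GZipStream decompression in the Mono run above (37,711,561 bytes in about 0.53 s) works out to roughly 67 MB/s.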

answered Sep 20 '22 by gatopeich


Compression performance benchmarks vary based on the size of the streams being compressed and their precise content. If this is a particularly important performance bottleneck for you, it'd be worth your time to write a sample app using each library and run tests with your real files.
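
For instance, a minimal sketch of such a sample app (the file path is a placeholder), timing decompression of one of your own .gz files with the framework's GZipStream:

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

class DecompressBenchmark
{
    static void Main(string[] args)
    {
        string path = args.Length > 0 ? args[0] : "sample.gz"; // placeholder file name
        long compressedBytes = new FileInfo(path).Length;

        var timer = Stopwatch.StartNew();
        long decompressedBytes;
        using (var file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            decompressedBytes = output.Length;
        }
        timer.Stop();

        Console.WriteLine("Decompressed {0:#,0} bytes (from {1:#,0}) in {2}",
            decompressedBytes, compressedBytes, timer.Elapsed);
        Console.WriteLine("Throughput: {0:0.0} MB/s",
            decompressedBytes / timer.Elapsed.TotalSeconds / (1024 * 1024));
    }
}

Swapping the GZipStream construction for Ionic.Zlib.GZipStream or SharpZipLib's GZipInputStream gives you the same measurement for each of the other libraries.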

answered Sep 20 '22 by Samuel Neff