How to compress a .NET object instance using GZip

I want to compress the results of database queries before adding them to the cache.

I want to be able to compress any reference type.

I have a working version of this for compressing strings. The idea is based on Scott Hanselman's blog post: http://shrinkster.com/173t

Any ideas for compressing a .NET object?

I know that it will be a read-only cache, since the objects in the cache will just be byte arrays.

Matt Peters asked Jun 08 '09 12:06


3 Answers

This won't work for arbitrary reference types, only for types marked [Serializable]. Hook up a BinaryFormatter to a compression stream that writes to a file:

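// Needs: System.IO, System.IO.Compression,
// System.Runtime.Serialization.Formatters.Binary
var formatter = new BinaryFormatter();
using (var outputFile = new FileStream("OutputFile", FileMode.CreateNew))
using (var compressionStream = new GZipStream(
                         outputFile, CompressionMode.Compress)) {
   // The object graph is serialized straight into the compressing stream.
   formatter.Serialize(compressionStream, objToSerialize);
   compressionStream.Flush();
}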

You could use a MemoryStream to hold the contents in memory, rather than writing to a file. I doubt this is really an effective solution for a cache, however.
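
For the in-memory variant, here is a rough sketch of what a byte-array round trip might look like. The CompressedCache class and its method names are hypothetical, not from the question. It also swaps in DataContractSerializer for BinaryFormatter, on the assumption that you are on a recent .NET version where BinaryFormatter is obsolete; on the older framework the same stream layout should work with BinaryFormatter too.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;

// Hypothetical helper for illustration only.
public static class CompressedCache
{
    // Serializes the value and returns the gzip-compressed bytes.
    public static byte[] Compress<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var buffer = new MemoryStream())
        {
            // The GZipStream must be disposed before the buffer is read,
            // otherwise the gzip footer may not have been written yet.
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, true))
            {
                serializer.WriteObject(gzip, value);
            }
            return buffer.ToArray();
        }
    }

    // Decompresses the bytes and deserializes them back into a T.
    public static T Decompress<T>(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var compressed = new MemoryStream(data))
        using (var gzip = new GZipStream(compressed, CompressionMode.Decompress))
        using (var plain = new MemoryStream())
        {
            // Inflate fully into memory first, then deserialize from a seekable stream.
            gzip.CopyTo(plain);
            plain.Position = 0;
            return (T)serializer.ReadObject(plain);
        }
    }
}
```

Usage would be something like byte[] packed = CompressedCache.Compress(rows); to store, and CompressedCache.Decompress<string[]>(packed) when reading back from the cache. Note that every cache hit pays the decompression and deserialization cost.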

mmx answered Oct 05 '22 10:10


What sort of objects are you putting in the cache? Are they typed objects, or things like DataTable? For a DataTable, perhaps store it as XML compressed through GZipStream. For typed (entity) objects, you'll probably need to serialize them.

You could use BinaryFormatter and GZipStream, or you could just use something like protobuf-net serialization (free), which is already very compact (adding GZipStream on top typically makes the data larger, which is typical of dense binary). In particular, the advantage of things like protobuf-net is that you get the reduced size without having to pay the CPU cost of unzipping it during deserialization. In some tests, before adding GZipStream, it was 4 times faster than BinaryFormatter. Add the extra time for GZip onto BinaryFormatter and protobuf-net should win by a considerable margin.
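
To see the "GZip makes dense binary larger" effect for yourself, a small sketch along these lines might help. GzipSizeDemo and its methods are hypothetical names, and the pseudo-random bytes are only a stand-in (my assumption) for already-compact serializer output; real protobuf-net payloads will behave somewhere between the two samples.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical demo class for illustration only.
public static class GzipSizeDemo
{
    // Returns the gzip-compressed form of the input bytes.
    public static byte[] Gzip(byte[] input)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return buffer.ToArray();
        }
    }

    // Pseudo-random bytes: a stand-in (assumption) for dense binary data.
    public static byte[] DenseSample(int length)
    {
        var bytes = new byte[length];
        new Random(12345).NextBytes(bytes);
        return bytes;
    }

    // Highly repetitive markup, closer to XML-style serializer output.
    public static byte[] RepetitiveSample(int repeats)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < repeats; i++)
        {
            sb.Append("<row><name>value</name></row>");
        }
        return Encoding.UTF8.GetBytes(sb.ToString());
    }
}
```

Comparing GzipSizeDemo.Gzip(sample).Length against sample.Length for both samples should show the repetitive one shrinking and the dense one growing slightly, because the gzip header and footer add bytes that the compressor cannot win back.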

Marc Gravell answered Oct 05 '22 09:10


I just added GZipStream support to my app today, so I can share some code here:

Serialization:

using (Stream s = File.Create(PathName))
{
    RijndaelManaged rm = new RijndaelManaged();
    rm.Key = CryptoKey;
    rm.IV = CryptoIV;
    using (CryptoStream cs = new CryptoStream(s, rm.CreateEncryptor(), CryptoStreamMode.Write))
    {
        using (GZipStream gs = new GZipStream(cs, CompressionMode.Compress))
        {
            BinaryFormatter bf = new BinaryFormatter();
            bf.Serialize(gs, _instance);
        }
    }
}

Deserialization:

using (Stream s = File.OpenRead(PathName))
{
    RijndaelManaged rm = new RijndaelManaged();
    rm.Key = CryptoKey;
    rm.IV = CryptoIV;
    using (CryptoStream cs = new CryptoStream(s, rm.CreateDecryptor(), CryptoStreamMode.Read))
    {
        using (GZipStream gs = new GZipStream(cs, CompressionMode.Decompress))
        {
            BinaryFormatter bf = new BinaryFormatter();
            _instance = (Storage)bf.Deserialize(gs);
        }
    }
}

NOTE: if you use CryptoStream, it is important that you chain (un)zipping and (de)crypting in exactly this order. Compress first and encrypt second: encrypted output looks like random noise and will not compress, so the redundancy has to be squeezed out of the data BEFORE encryption.

Daniel Mošmondor answered Oct 05 '22 10:10