The best way to Compress XML

Tags:

c#

I need to compress a very large XML file to the smallest possible size.

I work in C#, and I would prefer an open-source library or an application that I can call from my code, but I can implement an algorithm myself as well.

Thank you!

Matan asked Jan 06 '10 11:01



2 Answers

It may not be the "smallest size possible", but you could use System.IO.Compression to compress it. Gzip-style compression tends to work very well on text such as XML.

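using System.IO;
using System.IO.Compression;

// Note: File.OpenWrite does not truncate an existing file; File.Create does.
using (var fileStream = File.OpenWrite(...))
using (var zipStream = new GZipStream(fileStream, CompressionMode.Compress))
{
    // Bytes written here are gzip-compressed into fileStream.
    zipStream.Write(...);
}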
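For a very large file, it may help to stream the XML through the compressor rather than load it into memory first. The following is only a sketch of that idea, not the answerer's code: the class name `XmlGzip` and the methods `Compress` and `Decompress` are hypothetical, and the `CompressionLevel.Optimal` overload assumes .NET Framework 4.5 or later.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, named here only for illustration.
public static class XmlGzip
{
    // Streams inputPath through gzip into outputPath without
    // holding the whole document in memory.
    public static void Compress(string inputPath, string outputPath)
    {
        using (var input = File.OpenRead(inputPath))
        using (var output = File.Create(outputPath)) // truncates any existing file
        using (var gzip = new GZipStream(output, CompressionLevel.Optimal))
        {
            input.CopyTo(gzip);
        }
    }

    // Reverses Compress: reads the gzip file and writes the original bytes.
    public static void Decompress(string inputPath, string outputPath)
    {
        using (var input = File.OpenRead(inputPath))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = File.Create(outputPath))
        {
            gzip.CopyTo(output);
        }
    }
}
```

A call such as `XmlGzip.Compress("data.xml", "data.xml.gz")` (file names are placeholders) would write the compressed copy next to the original. `File.Create` is used for the output in this sketch because `File.OpenWrite` leaves stale trailing bytes when it overwrites a longer existing file.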
Kent Boogaart answered Oct 14 '22 13:10


Efficient XML Interchange (EXI) achieves the best available XML compression pretty consistently. Even without schemas, it is not uncommon for EXI output to be 2-5 times smaller than zip output. With schemas, you'll do even better.

If you're not opposed to a commercial implementation, you can use the .NET version of Efficient XML and call it directly from your C# code using standard .NET APIs. You can download a free trial copy from http://www.agiledelta.com/efx_download.html.

John Schneider answered Oct 14 '22 13:10