
Unzipping a Stream in C#

Tags: c#, .net, zip

I'm working in C#, and I'm downloading a zip file from the internet with one XML file in it. I wish to load this XML file. This is what I have so far:

byte[] data;
WebClient webClient = new WebClient();
try {
    data = webClient.DownloadData(downloadUrl);
}
catch (Exception ex) {
    Console.WriteLine("Error in DownloadData (Ex:{0})", ex.Message);
    throw;
}

if (data == null) {
    Console.WriteLine("Bulk data is null");
    throw new Exception("Bulk data is null");
}

//Create the stream
MemoryStream stream = new MemoryStream(data);
XmlDocument document = new XmlDocument();

//Gzip
GZipStream gzipStream = new GZipStream(stream, CompressionMode.Decompress);

//Load report straight from the gzip stream
try {
    document.Load(gzipStream);
}
catch (Exception ex) {
    Console.WriteLine("Error in Load (Ex:{0})", ex.Message);
    throw;
}

In document.Load I always get the following exception:
The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.

What am I doing wrong?

Roee Gavirel asked Nov 21 '11 13:11


4 Answers

If you have a byte array that contains a zip archive with a single file, you can use the ZipArchive class to get an unzipped byte array with the file's data. ZipArchive is available from .NET 4.5 onward, in the System.IO.Compression assembly (on .NET Framework you need to reference it explicitly; the related ZipFile class lives in System.IO.Compression.FileSystem).

The following function, adapted from this answer, works for me:

public static byte[] UnzipSingleEntry(byte[] zipped)
{
    using (var memoryStream = new MemoryStream(zipped))
    {
        using (var archive = new ZipArchive(memoryStream))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                using (var entryStream = entry.Open())
                {
                    using (var reader = new BinaryReader(entryStream))
                    {
                        return reader.ReadBytes((int)entry.Length);
                    }
                }
            }
        }
    }
    return null; // To quiet my compiler
}
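
To get from the unzipped bytes to the XmlDocument the question asks for, something like the following sketch should work. The class and method names (XmlBytes, Load) are made up for illustration and are not part of any library. It wraps the bytes in a MemoryStream rather than decoding them to a string, so that XmlDocument can pick up the encoding from the XML declaration or byte order mark itself.

```csharp
using System.IO;
using System.Xml;

public static class XmlBytes
{
    // Hypothetical helper: parses raw XML bytes into an XmlDocument.
    public static XmlDocument Load(byte[] xmlBytes)
    {
        var document = new XmlDocument();

        // Loading from a stream lets the XML parser detect the encoding.
        using (var stream = new MemoryStream(xmlBytes))
        {
            document.Load(stream);
        }

        return document;
    }
}
```

Assuming data holds the downloaded zip, usage would then be along the lines of: XmlDocument document = XmlBytes.Load(UnzipSingleEntry(data));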
Eli_B answered Oct 12 '22 09:10


Apparently SharpZipLib is now unmaintained and you probably want to avoid it: https://stackoverflow.com/a/593030

In .NET 4.5 there is now built-in support for zip files, so for your example it would be:

var data = new WebClient().DownloadData(downloadUrl);

//Create the stream
var stream = new MemoryStream(data);

var document = new XmlDocument();

//zip
var zipArchive = new ZipArchive(stream);

//Load report straight from the zip stream
document.Load(zipArchive.Entries[0].Open());
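
A slightly more defensive variant of the same idea is sketched below. It disposes the streams with using blocks and fails with a clear message when the archive is empty. The names ZippedXml and Load are hypothetical, and the sketch assumes the XML file is the first entry in the archive.

```csharp
using System.IO;
using System.IO.Compression;
using System.Xml;

public static class ZippedXml
{
    // Hypothetical helper: loads the first entry of a zip archive as XML.
    public static XmlDocument Load(byte[] zipped)
    {
        using (var stream = new MemoryStream(zipped))
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            // Assumption: the archive holds exactly one XML file.
            if (archive.Entries.Count == 0)
                throw new InvalidDataException("The zip archive contains no entries.");

            var document = new XmlDocument();

            // Read the XML directly from the decompressed entry stream.
            using (Stream entryStream = archive.Entries[0].Open())
            {
                document.Load(entryStream);
            }

            return document;
        }
    }
}
```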
Michael answered Oct 12 '22 07:10


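I am using SharpZipLib and it works great!

Below is a function that wraps the library's compression API (ZIP_EXTENSION and bufferSize are not defined in this snippet; presumably they are constants declared elsewhere in the class):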

public static void Compress(FileInfo sourceFile, string destinationFileName, string destinationTempFileName)
{
    Crc32 crc = new Crc32();
    string zipFile = Path.Combine(sourceFile.Directory.FullName, destinationTempFileName);
    zipFile = Path.ChangeExtension(zipFile, ZIP_EXTENSION);

    using (FileStream fs = File.Create(zipFile))
    {
        using (ZipOutputStream zOut = new ZipOutputStream(fs))
        {
            zOut.SetLevel(9);
            ZipEntry entry = new ZipEntry(ZipEntry.CleanName(destinationFileName));

            entry.DateTime = DateTime.Now;
            entry.ZipFileIndex = 1;
            entry.Size = sourceFile.Length;

            using (FileStream sourceStream = sourceFile.OpenRead())
            {
                // First pass: compute the CRC of the source file.
                crc.Reset();
                long len = sourceFile.Length;
                byte[] buffer = new byte[bufferSize];
                while (len > 0)
                {
                    int readSoFar = sourceStream.Read(buffer, 0, buffer.Length);
                    crc.Update(buffer, 0, readSoFar);
                    len -= readSoFar;
                }
                entry.Crc = crc.Value;
                zOut.PutNextEntry(entry);

                // Second pass: rewind and write the file contents into the zip entry.
                len = sourceStream.Length;
                sourceStream.Seek(0, SeekOrigin.Begin);
                while (len > 0)
                {
                    int readSoFar = sourceStream.Read(buffer, 0, buffer.Length);
                    zOut.Write(buffer, 0, readSoFar);
                    len -= readSoFar;
                }
            }
            zOut.Finish();
            zOut.Close();
        }
        fs.Close();
    }
}
Gregory Nozik answered Oct 12 '22 07:10


As the others have mentioned, GZip and Zip are not the same format, so you need a zip library rather than GZipStream. I use a library called DotNetZip, available from the site below:

http://dotnetzip.codeplex.com/
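
The difference between the two formats shows up in the very first bytes of the download, which is what the "magic number" in the exception refers to. Here is a rough sketch of a check you could run before choosing a decompressor. CompressionSniffer and Detect are names I made up for illustration. It only inspects the first two bytes, so treat it as a hint rather than a full validation.

```csharp
using System;

public static class CompressionSniffer
{
    // Hypothetical helper: guesses the container format from its leading bytes.
    public static string Detect(byte[] data)
    {
        if (data == null || data.Length < 2)
            return "unknown";

        // Zip archives begin with the ASCII letters "PK" (0x50 0x4B).
        if (data[0] == 0x50 && data[1] == 0x4B)
            return "zip";

        // GZip streams begin with the bytes 0x1F 0x8B.
        if (data[0] == 0x1F && data[1] == 0x8B)
            return "gzip";

        return "unknown";
    }
}
```

For the download in the question this should report "zip", which is why GZipStream rejects the header.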

Harag answered Oct 12 '22 08:10