
How can I reduce the time taken to extract files?

I wrote a C# program that processes about 30 zip archives containing about 35,000 files in total. I need to read every file in order to process its contents. Right now my code extracts all the archives to disk and then reads the files. The problem is that this takes about 15-20 minutes, which is too long.

I am using the following code to extract files:

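// ZipFile.Read and ExtractAll appear to be the DotNetZip (Ionic.Zip) API.
void ExtractFile(string zipfile, string path)
{
    // Dispose the archive so its file handle is released after extraction.
    using (ZipFile zip = ZipFile.Read(zipfile))
    {
        zip.ExtractAll(path);
    }
}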

The extraction step takes most of the time, and I need to reduce it. Is there a way to read the contents of the files inside a zip archive without extracting them? Or does anyone know another way to make this code faster?

Thanks in advance

user2945623 asked Jan 27 '14 17:01


2 Answers

You could try reading each entry into a memory stream instead of to the file system:

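using (ZipFile zip = ZipFile.Read(zipfile))
{
    foreach (ZipEntry entry in zip.Entries)
    {
        if (entry.IsDirectory) continue; // folder entries have no content to read

        using (MemoryStream ms = new MemoryStream())
        {
            entry.Extract(ms);             // decompress into memory, not onto disk
            ms.Seek(0, SeekOrigin.Begin);  // rewind before reading
            // read from the stream
        }
    }
}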
D Stanley answered Sep 22 '22 14:09


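Instead of extracting to the hard disk, you could read the entries without extraction: open the archive with ZipFile.OpenRead, then call ZipArchiveEntry.Open on each entry to get a stream. Both are in System.IO.Compression (.NET 4.5 and later), not in the library used in the question.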
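
A minimal sketch of that approach, assuming .NET 4.5 or later and the System.IO.Compression classes rather than the library from the question. ZipReader, ReadEntries and the process callback are hypothetical names used only for illustration:

```csharp
using System;
using System.IO;
using System.IO.Compression; // on .NET Framework, also reference System.IO.Compression.FileSystem

static class ZipReader
{
    // Calls process(entryName, stream) for every file entry in the archive
    // and returns how many entries were read. Nothing is written to disk.
    public static int ReadEntries(string zipPath, Action<string, Stream> process)
    {
        int count = 0;
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (string.IsNullOrEmpty(entry.Name)) continue;

                // Open() returns a stream that decompresses as it is read.
                using (Stream stream = entry.Open())
                {
                    process(entry.FullName, stream);
                    count++;
                }
            }
        }
        return count;
    }
}
```

You could call it as ZipReader.ReadEntries(zipfile, (name, stream) => { /* parse stream here */ }); wrapping the stream in a StreamReader is one way to read text files. Whether this is faster than ExtractAll depends on how much of the time is spent writing to disk, so it is worth measuring on the real archives.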

Also have a look at the CodeFluent Runtime tool, which claims to offer better performance.

cubitouch answered Sep 21 '22 14:09