I wrote a C# program that processes about 30 zip archives containing roughly 35,000 files in total. I need to read every file and process its contents. Right now my code extracts all the archives to disk and then reads the files. The problem is that this takes about 15-20 minutes, which is far too long.
I am using the following code to extract files:
void ExtractFile(string zipfile, string path)
{
    ZipFile zip = ZipFile.Read(zipfile);
    zip.ExtractAll(path);
}
Extraction is the step that takes the most time, and I need to reduce it. Is there a way to read the contents of the files inside a zip archive without extracting them? Or does anyone know another way to reduce the running time of this code?
Thanks in advance
You could try reading each entry into a memory stream instead of to the file system:
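// Dispose the archive when done so the underlying file handle is released.
using (ZipFile zip = ZipFile.Read(zipfile))
{
    foreach (ZipEntry entry in zip.Entries)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Decompress the entry into memory instead of onto disk.
            entry.Extract(ms);
            ms.Seek(0, SeekOrigin.Begin);
            // read from the stream
        }
    }
}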
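Instead of extracting to the hard disk, you could read the entries directly: open the archive with ZipFile.OpenRead (the System.IO.Compression class, not the DotNetZip one used in the question), then call the ZipArchiveEntry.Open method on each entry to get a stream over its contents.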
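As a rough sketch of that approach, something like the following might work. It assumes .NET 4.5 or later with references to System.IO.Compression and System.IO.Compression.FileSystem, and it assumes the entries are text files; the class name ZipReadSketch and the helper ReadAllText are made up for illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

static class ZipReadSketch
{
    // Hypothetical helper: returns the text of every file entry,
    // keyed by the entry's path inside the archive.
    public static Dictionary<string, string> ReadAllText(string zipPath)
    {
        var contents = new Dictionary<string, string>();

        // OpenRead opens the archive read-only; nothing is written to disk.
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (string.IsNullOrEmpty(entry.Name))
                    continue;

                // Open() returns a stream that decompresses the entry on the fly.
                using (Stream stream = entry.Open())
                using (var reader = new StreamReader(stream))
                {
                    contents[entry.FullName] = reader.ReadToEnd();
                }
            }
        }

        return contents;
    }
}
```

With around 35,000 files it is probably better to process each entry inside the loop rather than collect everything in a dictionary; the dictionary is only there to keep the sketch short.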
Also have a look at the CodeFluent Runtime tool, which claims to offer improved performance.