 

Archiving thousands of files and 7zip limitations

Tags:

c#

7zip

My application runs a daily task that has to zip 100,000+ PDF files (~50 KB each). Currently I'm using 7-Zip, calling 7za.exe (the command-line tool that ships with 7-Zip) once for each file. The files are located in many different folders.

What are the limitations in this approach and how can they be solved? Is there a file size or file count limit for a 7zip archive?

Omar asked Dec 22 '10 21:12




1 Answer

The limit on file size is 16 exabytes, or roughly 16,000,000,000 GB.

There is no hard limit on the number of files, but there is a practical limit that comes from how 7-Zip manages the headers for the files in memory. The exact limit depends on the path lengths, but on a 32-bit system you'll run into it somewhere around a million files.
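One way to stay well below that practical file count, and to avoid launching 7za.exe once per file, is to split the file list into batches and pass each batch to 7za through a list file (the `@listfile` syntax of the 7-Zip command line). The following is only a minimal sketch in Python; the same steps map onto a C# `Process.Start` call. The function names, the `batch_size` of 50,000 and the `batch_001.7z` naming are assumptions made for illustration, not something from the question.

```python
from pathlib import Path


def chunk(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_archives(pdf_paths, out_dir, batch_size=50_000):
    """Write one list file per batch and return the 7za commands to run.

    batch_size and the batch_NNN naming are hypothetical choices.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for index, batch in enumerate(chunk(list(pdf_paths), batch_size), start=1):
        # One path per line; 7-Zip reads list files as UTF-8 by default.
        list_file = out_dir / f"batch_{index:03d}.txt"
        list_file.write_text("\n".join(str(p) for p in batch) + "\n", encoding="utf-8")
        archive = out_dir / f"batch_{index:03d}.7z"
        # "a" adds files to an archive; "@file" tells 7za to read names from a list file.
        commands.append(["7za", "a", str(archive), f"@{list_file}"])
    return commands
```

Each returned command can then be run as a single process per batch, for example with `subprocess.run(cmd, check=True)`. How paths end up stored inside the archive depends on how they are written in the list file, so that is worth checking against your own folder layout before relying on it.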

I'm not sure if any other format supports more. Regular zip has far smaller limits.

From http://en.wikipedia.org/wiki/7-Zip:

One notable limitation of 7-Zip is that, while it supports file sizes of up to 16 exabytes, it has an unusually high overhead allocating memory for files, on top of the memory requirements for performing the actual compression.

Approximately 1 kilobyte is required per file (More if the pathname is very long) and the file listing alone can grow to an order of magnitude greater than the memory required to do the actual compression. In real world terms, this means 32-bit systems cannot compress more than a million or so files in one archive as the memory requirements exceed the 2 GB process limit.

64-bit systems do not suffer from the same process size limitation, but still require several gigabytes of RAM to overcome this limitation. Archives created on such systems would be unusable on machines with less memory however.
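To put those figures next to the 100,000 files in the question, here is a rough back-of-the-envelope calculation in Python. It assumes exactly 1 KB of listing memory per file, which is the approximate figure quoted above and a lower bound (long path names cost more), and it ignores the memory used by the compression itself. The constant and function names are hypothetical.

```python
# Assumption: ~1 KB of memory per archived file, as in the quoted text.
BYTES_PER_FILE = 1024
# The 2 GB process limit on 32-bit systems mentioned in the quote.
PROCESS_LIMIT_32BIT = 2 * 1024**3


def listing_memory_bytes(file_count, bytes_per_file=BYTES_PER_FILE):
    """Approximate memory needed just for the archive's file listing."""
    return file_count * bytes_per_file


def max_files_within(limit_bytes, bytes_per_file=BYTES_PER_FILE):
    """Upper bound on file count if the listing alone filled `limit_bytes`."""
    return limit_bytes // bytes_per_file


if __name__ == "__main__":
    for count in (100_000, 1_000_000):
        mib = listing_memory_bytes(count) / 1024**2
        print(f"{count:>9,} files -> ~{mib:,.0f} MiB for the file listing")
    print(f"listing alone fills 2 GB at {max_files_within(PROCESS_LIMIT_32BIT):,} files")
```

Under these assumptions, 100,000 files need about 98 MiB for the listing and one million files need about 977 MiB. The listing alone would fill the 2 GB limit at roughly two million files, so with long paths and the compressor's own memory added on top, a ceiling of "a million or so" is plausible.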

Samuel Neff answered Oct 03 '22 16:10