
What is the fastest way to extract a single file from a zip file that contains a lot of files?

I tried the java.util.zip package, but it is too slow.

Then I found the LZMA SDK and 7z jbinding, but both are lacking something.

The LZMA SDK does not come with any documentation or tutorial on how to use it, which is very frustrating. There is no Javadoc.

The 7z jbinding, meanwhile, does not provide a simple way to extract only one file; it only provides a way to extract the entire contents of the zip file. Moreover, it does not provide a way to specify where the unzipped file should be placed.

Any ideas?

lamwaiman1988 asked Mar 30 '11 08:03



3 Answers

What does your code with java.util.zip look like and how big of a zip file are you dealing with?

I'm able to extract a 4MB entry out of a 200MB zip file with 1,800 entries in roughly a second with this:

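import java.io.*;
import java.util.zip.*;

// Scan the archive sequentially and copy only the matching entry.
// try-with-resources closes the streams even if an exception is thrown.
try (ZipInputStream zin = new ZipInputStream(
        new BufferedInputStream(new FileInputStream("your.zip")))) {
    ZipEntry ze;
    while ((ze = zin.getNextEntry()) != null) {
        if (ze.getName().equals("your.file")) {
            // Open the output only once the entry has been found,
            // so a missing entry does not leave an empty file behind.
            try (OutputStream out = new FileOutputStream("your.file")) {
                byte[] buffer = new byte[8192];
                int len;
                while ((len = zin.read(buffer)) != -1) {
                    out.write(buffer, 0, len);
                }
            }
            break;
        }
    }
}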
WhiteFang34 answered Nov 04 '22 09:11



I have not benchmarked it, but with Java 7 or later I extract a file as follows. I would imagine that it's faster than the ZipFile API.

A short example extracting META-INF/MANIFEST.MF from the zip file test.zip:

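import java.io.File;
import java.nio.file.*;

// file to extract from zip file
String file = "MANIFEST.MF";
// location to extract the file to
File outputLocation = new File("D:/temp/", file);
// path to the zip file
Path zipFile = Paths.get("D:/temp/test.zip");

// load zip file as filesystem; the (ClassLoader) cast picks the overload that
// exists since Java 7 and keeps the call unambiguous on Java 13 and later
try (FileSystem fileSystem = FileSystems.newFileSystem(zipFile, (ClassLoader) null)) {
    // copy file from zip file to output location
    // (throws FileAlreadyExistsException if the target already exists)
    Path source = fileSystem.getPath("META-INF/" + file);
    Files.copy(source, outputLocation.toPath());
}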
flavio.donze answered Nov 04 '22 10:11



Use a ZipFile rather than a ZipInputStream.

Although the documentation does not indicate this (it's in the docs for JarFile), it should use random-access file operations to read the file. Since a ZIP file keeps its directory at a known location (the end of the file), this means a LOT less IO has to happen to find a particular entry.

Some caveats: to the best of my knowledge, the Sun implementation uses a memory-mapped file. This means that your virtual address space has to be large enough to hold the file as well as everything else in your JVM, which may be a problem for a 32-bit server. On the other hand, it may be smart enough to avoid memory-mapping on 32-bit, or to memory-map just the directory; I haven't tried.

Also, if you're working with multiple zip files, be sure to use a try/finally to ensure that each one is closed after use.
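
For illustration, here is a minimal sketch of the ZipFile approach described above. The class name SingleEntryExtractor and the method extract are made up for this example, and it uses try-with-resources (Java 7 or later) in place of an explicit try/finally. It has not been benchmarked against the other answers.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, named here only for the example.
public class SingleEntryExtractor {

    /**
     * Copies one entry out of a zip archive to the given target path.
     * Returns false when the archive has no file entry with that name.
     */
    public static boolean extract(Path zip, String entryName, Path target) throws IOException {
        // try-with-resources closes the ZipFile even if the copy fails
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            // Look the entry up by name instead of iterating over all entries
            ZipEntry entry = zipFile.getEntry(entryName);
            if (entry == null || entry.isDirectory()) {
                return false;
            }
            try (InputStream in = zipFile.getInputStream(entry)) {
                // Overwrites the target if it already exists
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return true;
        }
    }
}
```

A call would look something like SingleEntryExtractor.extract(Paths.get("your.zip"), "your.file", Paths.get("your.file")), with the target path pointing wherever you want the extracted file to go.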

kdgregory answered Nov 04 '22 09:11
