The java.nio.file package has a beautiful way of handling zip files: it can treat them as file systems, which enables us to treat zip file contents like ordinary files. Zipping a whole folder can therefore be achieved by simply using Files.copy to copy all the files into the zip file. Since subfolders are to be copied as well, we need a FileVisitor:
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

private static class CopyFileVisitor extends SimpleFileVisitor<Path> {
    private final Path targetPath;
    private Path sourcePath = null;

    public CopyFileVisitor(Path targetPath) {
        this.targetPath = targetPath;
    }

    @Override
    public FileVisitResult preVisitDirectory(final Path dir,
            final BasicFileAttributes attrs) throws IOException {
        if (sourcePath == null) {
            // The first directory visited is the source root itself
            sourcePath = dir;
        } else {
            // Re-create the directory structure relative to the source root.
            // relativize(...).toString() is needed because source and target
            // may live on different file systems (default FS vs. zip FS).
            Files.createDirectories(targetPath.resolve(sourcePath
                    .relativize(dir).toString()));
        }
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(final Path file,
            final BasicFileAttributes attrs) throws IOException {
        // Copy each file to the corresponding relative path in the target
        Files.copy(file,
                targetPath.resolve(sourcePath.relativize(file).toString()),
                StandardCopyOption.REPLACE_EXISTING);
        return FileVisitResult.CONTINUE;
    }
}
This is a simple "copy directory recursively" visitor; on a regular file system it just mirrors one directory tree into another.
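A minimal usage sketch for the plain-copy case (the source and target paths here are hypothetical placeholders):

// Recursively copy /data/source into /data/target
Files.walkFileTree(Paths.get("/data/source"),
        new CopyFileVisitor(Paths.get("/data/target")));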
However, with the ZipFileSystem, we can also use it to copy a directory into a zip file, like this:
// Additional imports: java.net.URI, java.nio.file.FileSystem,
// java.nio.file.FileSystems, java.util.HashMap, java.util.Map
public static void zipFolder(Path zipFile, Path sourceDir) throws IOException {
    // Initialize the zip file system; "create" makes a new archive if none exists
    Map<String, String> env = new HashMap<>();
    env.put("create", "true");
    URI uri = URI.create("jar:" + zipFile.toUri());
    // try-with-resources: closing the file system flushes and finalizes the zip
    try (FileSystem fileSystem = FileSystems.newFileSystem(uri, env)) {
        Path root = fileSystem.getRootDirectories().iterator().next();
        // Simply copy the directory into the root of the zip file system
        Files.walkFileTree(sourceDir, new CopyFileVisitor(root));
    }
}
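Invoking it is then a one-liner; for instance (the paths here are hypothetical):

// Zip the contents of /home/user/photos into /tmp/photos.zip
zipFolder(Paths.get("/tmp/photos.zip"), Paths.get("/home/user/photos"));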
This is what I call an elegant way of zipping a whole folder. However, when using this method on a huge folder (around 3 GB), I receive an OutOfMemoryError (heap space). When using a usual zip handling library, this error is not raised. It thus seems that the way the ZipFileSystem handles the copy is very inefficient: too much of the data to be written is kept in memory, so the OutOfMemoryError occurs.
Why is this the case? Is using ZipFileSystem generally considered inefficient (in terms of memory consumption), or am I doing something wrong here?
I looked at ZipFileSystem.java, and I believe I found the source of the memory consumption. By default, the implementation uses a ByteArrayOutputStream as the buffer for compressing the files, which means it is limited by the amount of heap assigned to the JVM.
There is an (undocumented) property, "useTempFile", that can be put into the env map to make the implementation buffer in temporary files instead. It works like this:
Map<String, Object> env = new HashMap<>();
env.put("create", "true");
// Spill compressed data to temporary files instead of in-memory byte arrays
env.put("useTempFile", Boolean.TRUE);
More details here: http://www.docjar.com/html/api/com/sun/nio/zipfs/ZipFileSystem.java.html; the interesting lines are 96, 1358, and 1362.
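Putting it together, a sketch of the zipFolder method from the question with the temp-file option enabled (it reuses the CopyFileVisitor from above; the method name is mine, and "useTempFile" is an internal, undocumented option, so treat this as a best-effort sketch rather than a stable API):

public static void zipFolderViaTempFiles(Path zipFile, Path sourceDir) throws IOException {
    Map<String, Object> env = new HashMap<>();
    env.put("create", "true");
    // Buffer deflated entries in temp files rather than on the heap
    env.put("useTempFile", Boolean.TRUE);
    URI uri = URI.create("jar:" + zipFile.toUri());
    try (FileSystem fileSystem = FileSystems.newFileSystem(uri, env)) {
        Path root = fileSystem.getRootDirectories().iterator().next();
        Files.walkFileTree(sourceDir, new CopyFileVisitor(root));
    }
}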