
How to extract a single file from a remote archive file?

Given

  1. URL of an archive (e.g. a zip file)
  2. Full name (including path) of a file inside that archive

I'm looking for a way (preferably in Java) to create a local copy of that file, without downloading the entire archive first.

From my (limited) understanding it should be possible, though I have no idea how to do that. I've been using TrueZip, since it seems to support a large variety of archive types, but I have doubts about its ability to work in such a way. Does anyone have any experience with that sort of thing?

EDIT: It is also important for me to be able to do this with tarballs and zipped tarballs.

Oak asked Jun 26 '10 23:06



1 Answer

Well, at a minimum, you have to download the portion of the archive up to and including the compressed data of the file you want to extract. That suggests the following solution: open a URLConnection to the archive, get its input stream, wrap it in a ZipInputStream, and repeatedly call getNextEntry() to step through the entries in order until you reach the one you want (each call skips whatever is left of the previous entry, so an explicit closeEntry() should not be needed). Then you can read its data using ZipInputStream.read(...), which returns -1 once the end of that entry is reached.

The Java code would look something like this:

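import java.io.*;
import java.net.URL;
import java.util.zip.*;

URL url = new URL("http://example.com/path/to/archive");
try (ZipInputStream zin = new ZipInputStream(url.openStream())) {
    ZipEntry ze = zin.getNextEntry();
    // getNextEntry() closes the previous entry itself and returns null after the last one
    while (ze != null && !ze.getName().equals(pathToFile)) {
        ze = zin.getNextEntry();
    }
    if (ze == null) throw new FileNotFoundException(pathToFile);
    // ze.getSize() can be -1 when reading from a stream, so read until the entry ends
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[8192];
    for (int n; (n = zin.read(buf)) != -1; ) out.write(buf, 0, n);
    byte[] bytes = out.toByteArray(); // contents of the entry named pathToFile
}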

This is, of course, untested.
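
If what you want is a local copy on disk rather than a byte array, the same loop can be wrapped in a small helper. This is only a sketch: the class name ZipEntryExtractor and the method extractEntry are names I made up for illustration, and it takes any InputStream so it can be tried against an in-memory archive before pointing it at a real URL.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
public class ZipEntryExtractor {

    /**
     * Scans a zip stream sequentially and copies the entry named entryName to target.
     * Returns true if the entry was found, false if the archive ended first.
     */
    public static boolean extractEntry(InputStream archive, String entryName, Path target)
            throws IOException {
        try (ZipInputStream zin = new ZipInputStream(archive)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    // read() on zin returns -1 at the end of the current entry,
                    // so this copies exactly one entry's data
                    Files.copy(zin, target, StandardCopyOption.REPLACE_EXISTING);
                    return true; // stop here; this code never reads the later entries
                }
            }
            return false;
        }
    }
}
```

For a remote archive you would pass url.openStream() as the first argument. Everything in front of the wanted entry is still transferred and skipped, which matches the "at a minimum" caveat above.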

David Z answered Nov 15 '22 17:11
