
Get Directory Entries in a ZIP File

Tags: java, zip

I'm writing a Java program that, given a qualified name (e.g. java.lang.String), fetches and returns the corresponding entry from the JDK's src.zip file for further processing.

So far my program works fine for any qualified name that refers to a specific .java source file, but I'm having trouble when the qualified name refers to a whole package (e.g. java.util.*). In that case I want my program to return a listing of all entries in the given package.

The problem is that there seems to be no way to do this efficiently with the utilities in the java.util.zip package. I have tried both ZipFile and ZipInputStream, and neither recognizes the directories in the src.zip file. They only return entries for the individual .java source files.

In code, both:

// try-with-resources closes the stream when the loop is done
try (ZipInputStream zip = new ZipInputStream(new FileInputStream("src.zip"))) {
    ZipEntry entry;
    while ((entry = zip.getNextEntry()) != null) {
        System.out.println(entry.isDirectory());
    }
}

and:

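// try-with-resources closes the ZipFile when the loop is done
try (ZipFile zip = new ZipFile("src.zip")) {
    Enumeration<? extends ZipEntry> entries = zip.entries();
    while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        System.out.println(entry.isDirectory());
    }
}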

always print false: no directories at all!

Even the following lookup does not help; it just returns null (meaning the entry was not found):

ZipFile zipfile = new ZipFile("src.zip");
zipfile.getEntry("java/util/");

A work-around is to use either of the two iterations I listed above and perform an exhaustive search for the desired entries:

if (entry.getName().startsWith("java/util/"))
    System.out.println(entry);

But clearly this is not efficient! Is there no way to either retrieve the entry of a directory in the src.zip file or to efficiently list the entries for a given directory path? Note that I want to directly process the ZIP file without extraction (for obvious reasons).
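
For reference, the exhaustive search described above could be put together as a complete program roughly as follows. This is only a minimal sketch: the class name PackageLister and the method listPackage are made up for illustration, and it assumes entry names of the form java/util/List.java as in the question. It returns only the direct children of the package and skips sub-packages.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not part of java.util.zip.
public class PackageLister {

    // Lists the entries directly inside a package, e.g. "java.util" -> "java/util/...".
    public static List<String> listPackage(String zipPath, String packageName) throws IOException {
        String prefix = packageName.replace('.', '/') + "/";
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Keep direct children only: the name must start with the prefix,
                // must not be the directory entry itself (if one exists), and
                // must not contain a further '/' (which would mean a sub-package).
                if (name.startsWith(prefix)
                        && name.length() > prefix.length()
                        && name.indexOf('/', prefix.length()) < 0) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: PackageLister <zip-file> <package-name>");
            return;
        }
        for (String name : listPackage(args[0], args[1])) {
            System.out.println(name);
        }
    }
}
```

It would be run as, for example, java PackageLister src.zip java.util. Every call still walks all entries of the archive, so this does not address the efficiency concern; it only makes the workaround concrete.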


Update

As is apparent from the discussion under Timothy Truckle's answer, the results above were obtained with the src.zip file from the latest Oracle JDK at the time of writing (JDK 8 update 111). The results differ with the src.zip file from a different JDK version (e.g. JDK 7 update 80). Credit goes to marabu for pointing out the unzip -l utility in the comments.

Note

While the problem of retrieving directory entries is solved, the problem of efficiently listing the entries inside a given directory path of a ZIP file is not. I still consider the question closed: according to Timothy Truckle's answer, this can only be done by an exhaustive search over the entries, due to limitations of the ZIP format.
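
If many package lookups are made against the same archive, the cost of the exhaustive search can at least be paid only once, by grouping the entry names by their parent path in a single pass and answering later queries from memory. The following is a rough sketch of that idea, not a way around the format limitation; the class name ZipDirectoryIndex and its list method are invented for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: scans the archive once and groups file entries by directory.
public class ZipDirectoryIndex {

    // Maps a directory path such as "java/util/" to the file entries directly inside it.
    private final Map<String, List<String>> filesByDirectory = new HashMap<>();

    public ZipDirectoryIndex(String zipPath) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith("/")) {
                    continue; // explicit directory entry, not needed for the index
                }
                // Everything up to and including the last '/' is the parent path;
                // entries at the archive root get the empty string as their key.
                String directory = name.substring(0, name.lastIndexOf('/') + 1);
                filesByDirectory.computeIfAbsent(directory, k -> new ArrayList<>()).add(name);
            }
        }
    }

    // Returns the files directly inside the given directory (path ending in '/').
    public List<String> list(String directory) {
        return filesByDirectory.getOrDefault(directory, Collections.emptyList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ZipDirectoryIndex <zip-file> <directory/>");
            return;
        }
        new ZipDirectoryIndex(args[0]).list(args[1]).forEach(System.out::println);
    }
}
```

Because the directory keys are derived from the file names, this works whether or not the archive contains explicit directory entries. A directory that holds only sub-directories and no files will not appear as a key.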

Seyed Mohammad asked Oct 30 '22 12:10


1 Answer

@RC. So you're saying that it's related to this specific src.zip file? Because I've seen other ZIP files which do have entries for directories. – Seyed Mohammad

No and yes. The ZIP file format only knows about files, not directories.

What you may have seen in other ZIP files is that they include a zero-length entry for each (sub-)folder (by convention, an entry whose name ends in /). But including such entries is neither required nor the default.

But even if these special entries exist, you cannot handle them as folders within the ZIP directly. Even the fact that all files in the same subfolder are listed consecutively is accidental; it is neither required nor implied by the format.
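
That directory entries are optional can be checked with a small experiment. The sketch below writes two tiny archives with ZipOutputStream, one with an explicit java/util/ entry and one without, and then looks that entry up with ZipFile.getEntry. The class name DirectoryEntryDemo, the helper names, and the temporary file names are all made up for the demonstration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class illustrating that directory entries are optional.
public class DirectoryEntryDemo {

    // Writes a one-file archive; the directory entry is added only on request.
    public static Path writeZip(boolean withDirectoryEntry) throws IOException {
        Path path = Files.createTempFile("demo", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(path))) {
            if (withDirectoryEntry) {
                // A directory entry is an empty entry whose name ends in '/'.
                out.putNextEntry(new ZipEntry("java/util/"));
                out.closeEntry();
            }
            out.putNextEntry(new ZipEntry("java/util/List.java"));
            out.write("// placeholder".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return path;
    }

    // True only if the archive contains an explicit entry for the directory.
    public static boolean hasDirectoryEntry(Path zipPath, String directory) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            ZipEntry entry = zip.getEntry(directory);
            return entry != null && entry.isDirectory();
        }
    }

    public static void main(String[] args) throws IOException {
        Path with = writeZip(true);
        Path without = writeZip(false);
        System.out.println("with directory entry:    " + hasDirectoryEntry(with, "java/util/"));
        System.out.println("without directory entry: " + hasDirectoryEntry(without, "java/util/"));
        Files.delete(with);
        Files.delete(without);
    }
}
```

Both archives hold the same file under the same path. Only the first one answers the java/util/ lookup, which would explain why src.zip files produced by different tools or JDK builds behave differently.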

Timothy Truckle answered Nov 15 '22 05:11