How does one find the start of the "Central Directory" in zip files?

Wikipedia has an excellent description of the ZIP file format, but the "central directory" structure is confusing to me. Specifically this:

This ordering allows a ZIP file to be created in one pass, but it is usually decompressed by first reading the central directory at the end.

The problem is that even the trailing header for the central directory is variable length. How, then, can someone find the start of the central directory in order to parse it?

(Oh, and I did spend some time looking at APPNOTE.TXT in vain before coming here and asking :P)

Billy ONeal asked Jan 26 '11 07:01



2 Answers

My condolences. Reading the Wikipedia description gives me the very strong impression that you need to do a fair amount of guess-and-check work:

Hunt backwards from the end of the file for the end-of-central-directory signature 0x06054b50. The 4-byte field 16 bytes past the start of that signature holds the offset of the start of the central directory; seek there and hope you find the central directory file header signature 0x02014b50. You can add sanity checks, such as confirming that the comment length field (the 2 bytes right after the offset) matches the number of comment bytes left in the file. Even so, it sure feels like zip decoders work because people don't put funny characters into their zip comments, filenames, and so forth. This is based entirely on the Wikipedia page, anyhow.
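The backwards hunt described above could be sketched roughly as follows in Python. This is only an illustration under a few assumptions: the whole archive fits in memory as a `bytes` object, the archive starts at byte 0 (no self-extractor stub in front), and it is not a zip64 or multi-disk archive. The function name `find_central_directory` is made up for this example.

```python
import struct

EOCD_SIG = b"PK\x05\x06"   # 0x06054b50 as stored on disk (little-endian)
CDFH_SIG = b"PK\x01\x02"   # 0x02014b50 as stored on disk (little-endian)
EOCD_FIXED_SIZE = 22       # size of the record without its trailing comment


def find_central_directory(data: bytes) -> int:
    """Return the offset of the central directory inside `data`.

    Hypothetical helper: walks candidate signatures from the end of the
    buffer and accepts the first one that passes two sanity checks.
    """
    pos = len(data)
    while True:
        # Find the next candidate signature that ends at or before `pos`.
        pos = data.rfind(EOCD_SIG, 0, pos)
        if pos < 0:
            raise ValueError("no end-of-central-directory record found")
        if pos + EOCD_FIXED_SIZE > len(data):
            continue  # too close to the end to be a complete record
        # Fields from byte 10: total entries (2), directory size (4),
        # directory offset (4, i.e. 16 bytes past the signature start),
        # comment length (2).
        total_entries, _cd_size, cd_offset, comment_len = struct.unpack_from(
            "<HIIH", data, pos + 10
        )
        # Check 1: the comment must run exactly to the end of the file.
        if pos + EOCD_FIXED_SIZE + comment_len != len(data):
            continue
        # Check 2: a non-empty archive must have a central directory
        # file header signature at the recorded offset.
        if total_entries and data[cd_offset:cd_offset + 4] != CDFH_SIG:
            continue
        return cd_offset
```

A call like `find_central_directory(open("archive.zip", "rb").read())` would return the byte offset at which the directory entries start. The two checks are what keep a stray signature inside a comment from being mistaken for the real record, though a deliberately crafted comment could still fool them.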

sarnold answered Sep 22 '22 19:09


I implemented zip archive support some time ago, and I searched the last few kilobytes of the file for the end of central directory signature (4 bytes). That works pretty well until somebody puts 50 KB of text into the archive comment, which is unlikely to happen. To be absolutely sure, you can search the last 64 KB plus a few bytes (65,535 + 22 to be exact), since the comment length is a 16-bit field and the fixed part of the record is 22 bytes. After that, I look for the zip64 end of central directory locator just before it, which is easier since it has a fixed structure.
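A minimal sketch of that approach in Python might look like the following. It reads only the tail of a seekable binary file, takes the last signature match without further validation, and then checks the 20 bytes immediately before it for the zip64 locator signature (0x07064b50). The function name `locate_end_records` and its return shape are assumptions made for this example, not part of any library.

```python
import struct

EOCD_SIG = b"PK\x05\x06"            # end of central directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"   # zip64 end of central directory locator
EOCD_FIXED_SIZE = 22                # record size without the comment
MAX_COMMENT = 0xFFFF                # comment length is a 16-bit field
ZIP64_LOCATOR_SIZE = 20             # fixed-size structure


def locate_end_records(f):
    """Return (eocd_offset, zip64_eocd_offset or None) for a seekable file.

    Hypothetical helper: no sanity checks are applied to the match.
    """
    f.seek(0, 2)                    # jump to the end to learn the file size
    file_size = f.tell()
    # The record cannot start earlier than 22 + 65535 bytes from the end.
    window = min(file_size, EOCD_FIXED_SIZE + MAX_COMMENT)
    f.seek(file_size - window)
    tail = f.read(window)

    idx = tail.rfind(EOCD_SIG)
    if idx < 0:
        raise ValueError("no end-of-central-directory signature found")
    eocd_offset = file_size - window + idx

    zip64_offset = None
    loc_start = eocd_offset - ZIP64_LOCATOR_SIZE
    if loc_start >= 0:
        f.seek(loc_start)
        locator = f.read(ZIP64_LOCATOR_SIZE)
        if locator[:4] == ZIP64_LOCATOR_SIG:
            # Layout: signature (4), disk number (4),
            # offset of the zip64 end record (8), total disks (4).
            _, _, zip64_offset, _ = struct.unpack("<4sIQI", locator)
    return eocd_offset, zip64_offset
```

With `open("archive.zip", "rb") as f`, calling `locate_end_records(f)` gives the position of the classic record and, when a locator is present, the position of the zip64 end record that holds the 64-bit directory offset. Bounding the search to 65,557 bytes keeps the read small even for very large archives.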

Nickolay Olshevsky answered Sep 22 '22 19:09