
Java/zip: Why are .jar files non-deterministically created?

Tags: java, zip

I never really looked into it, but I have now realized that I can't easily build two identical .jar files.

I mean, if I build twice, without changing anything, I get the exact same size but different checksums for the .jar.

So I quickly ran a test (basically unzipping, sort -n -k 5'ing and then diff'ing) and saw that all the files inside the .jar files were identical, yet the .jar files themselves were different.

So I did a test with a plain .zip file and found this:

... $ zip 1.zip a.txt
... $ zip 2.zip a.txt
... $ ls -l ?.zip
-rw-rw-r-- 1 webinator webinator 147 2010-07-21 13:09 1.zip
-rw-rw-r-- 1 webinator webinator 147 2010-07-21 13:09 2.zip

(exact same .zip file size)

... $ sha1sum ?.zip
db99f6ad5733c25c0ef1695ac3ca3baf5d5245cf  1.zip
eaf9f0f92eb2ac3e6ac33b44ef45b170f7984a91  2.zip

(different SHA-1 sums, let's see why)

$ hexdump -C 1.zip > 1.txt
$ hexdump -C 2.zip > 2.txt
$ diff 1.txt 2.txt
3c3
< 00000020  74 78 74 55 54 09 00 03  ab d4 46 4c*4e*d5 46 4c  |txtUT.....FLN.FL|
---
> 00000020  74 78 74 55 54 09 00 03  ab d4 46 4c*5d*d5 46 4c  |txtUT.....FL].FL|

Unzipping either zip file gives back the same a.txt, of course.

Question: why is that? (I'll answer myself)

SyntaxT3rr0r asked Jul 21 '10 10:07




1 Answer

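(Answering my own question) It is because the .zip file format saves timestamps in its headers: each entry records its last-modification time, and the "UT" extra field visible in the hexdump above can additionally carry access and creation times.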
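To see which timestamp actually changed between 1.zip and 2.zip, the bytes that follow "UT" in the hexdump can be decoded by hand. The sketch below assumes the Info-ZIP extended timestamp layout (a 2-byte little-endian payload size, one flag byte, then one 4-byte little-endian Unix time per flag bit, in the order modification, access, creation). That layout is my assumption about what zip wrote here, and the class and constant names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

final class UtFieldDecoder {
    // Bytes copied from the hexdump, starting right after "UT" (55 54).
    static final byte[] FROM_1_ZIP = {
        0x09, 0x00, 0x03, (byte) 0xab, (byte) 0xd4, 0x46, 0x4c, 0x4e, (byte) 0xd5, 0x46, 0x4c
    };
    static final byte[] FROM_2_ZIP = {
        0x09, 0x00, 0x03, (byte) 0xab, (byte) 0xd4, 0x46, 0x4c, 0x5d, (byte) 0xd5, 0x46, 0x4c
    };

    /** Returns the Unix times (in seconds) stored in the field, in field order. */
    static long[] decode(byte[] afterHeaderId) {
        ByteBuffer buf = ByteBuffer.wrap(afterHeaderId).order(ByteOrder.LITTLE_ENDIAN);
        int payloadSize = Short.toUnsignedInt(buf.getShort()); // flag byte + timestamps
        int flags = Byte.toUnsignedInt(buf.get());             // assumed: bit 0 mtime, bit 1 atime, bit 2 ctime
        // Never read more timestamps than the payload size allows.
        int count = Math.min(Integer.bitCount(flags & 0x07), (payloadSize - 1) / 4);
        long[] seconds = new long[count];
        for (int i = 0; i < count; i++) {
            seconds[i] = buf.getInt(); // 32-bit value, widened to long
        }
        return seconds;
    }

    public static void main(String[] args) {
        for (byte[] field : new byte[][] {FROM_1_ZIP, FROM_2_ZIP}) {
            for (long s : decode(field)) {
                System.out.print(Instant.ofEpochSecond(s) + "  ");
            }
            System.out.println();
        }
    }
}
```

Read this way, the first timestamp is the same in both archives and only the second one differs, by 15 seconds. If the flag byte 03 means what I assume, that second value is the access time of a.txt, which changes simply because zip reads the file.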

If you really do want to create two identical .zip (or .jar) files, you have to give every entry in the second archive exactly the same timestamps as in the first one.
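One way to do that from Java is to write the archive yourself and pin each entry's time instead of letting the current time be used. The following is only a minimal sketch using `java.util.zip`; the class name, the `build` helper and the `FIXED_TIME_MILLIS` value are hypothetical and chosen purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class DeterministicZip {
    // Hypothetical fixed timestamp (milliseconds since the epoch) used for every entry.
    static final long FIXED_TIME_MILLIS = 1279710379000L;

    /** Builds a one-entry zip in memory with an explicit entry time. */
    static byte[] build(String name, byte[] content, long entryTimeMillis) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            ZipEntry entry = new ZipEntry(name);
            entry.setTime(entryTimeMillis); // pin the time rather than inheriting "now"
            zip.putNextEntry(entry);
            zip.write(content);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] content = "hello\n".getBytes(StandardCharsets.UTF_8);
        byte[] first = build("a.txt", content, FIXED_TIME_MILLIS);
        byte[] second = build("a.txt", content, FIXED_TIME_MILLIS);
        System.out.println("identical: " + Arrays.equals(first, second));
    }
}
```

For an archive with several entries, the entries would also have to be added in the same order each time. As far as I know, `setTime` converts the value using the JVM's default time zone, so byte-identical output across machines probably also needs the same time zone setting.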

SyntaxT3rr0r answered Sep 19 '22 15:09