
Zip files contain same files but have different hashes?

Tags: php, zip, md5-file

I have created hundreds of folders and text files using PHP, and I then add them to a zip archive.

This all works fine, but if I create another zip archive from the same folders and files, the new archive has a different hash from the first one. The same happens if I use WinRAR instead of PHP to create the archive.

The hashes only seem to differ when I zip files that I created through PHP, yet the archives open fine.

Very strange. Can anyone shed any light on this?

Thanks

arbme asked Jul 22 '12 20:07


2 Answers

Zip output is not deterministic: the archive records metadata such as file modification times, so the same contents can produce different bytes. This is a real problem when, for example, a CI job deploys an AWS Lambda and should update it only when something has actually changed, not on every run. To solve it I used this article: https://medium.com/@pat_wilson/building-deterministic-zip-files-with-built-in-commands-741275116a19
Like this:

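# Set the mtime of every file to the date of the newest commit that touched a tracked file
find . -exec touch -t "$(git ls-files -z . | \
  xargs -0 -n1 -I{} -- git log -1 --date=format:"%Y%m%d%H%M" --format="%ad" '{}' | \
  sort -r | head -n 1)" '{}' +
# -D: no directory entries; -X: no extra file attributes (uid/gid, extra timestamps)
zip -rq -D -X -9 -A --compression-method deflate dest.zip sources...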
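
One way to check that a recipe like this actually yields stable output is to build the same tree twice and compare the digests. The following is only a minimal sketch, not the article's method: build_zip and digest are hypothetical helper names, the fixed timestamp is an arbitrary assumption, and it presumes Info-ZIP zip and GNU md5sum are installed. Note that it rewrites the modification times of the source files on disk, just as the touch command above does.

```shell
#!/bin/sh
# Sketch only: build_zip and digest are hypothetical helper names.

# build_zip SRC_DIR OUT_ZIP
# Pins every mtime under SRC_DIR to one fixed instant, then archives the
# regular files in byte-sorted order so zip is always given the same order.
build_zip() {
    src=$1
    out=$2
    # Make OUT_ZIP absolute, because the subshell below changes directory.
    case $out in
        /*) ;;
        *) out=$PWD/$out ;;
    esac
    rm -f "$out"    # zip would otherwise update an existing archive in place
    (
        cd "$src" || exit 1
        # Arbitrary fixed timestamp (assumption); this rewrites the mtimes on disk.
        find . -exec touch -t 200001010000 {} +
        # -X: no extra fields (uid/gid, access times); -D: no directory entries;
        # -@: read the file list from stdin, one name per line.
        find . -type f | LC_ALL=C sort | zip -q -X -D -@ "$out"
    )
}

# digest FILE: print only the MD5 hex digest of FILE.
digest() {
    md5sum "$1" | cut -d' ' -f1
}
```

For example, after running build_zip ./site one.zip and build_zip ./site two.zip, the outputs of digest one.zip and digest two.zip should match. Keep the archive outside the source directory, and note that file names containing newlines are not handled by this sketch. Archives built by different zip versions or on machines with different time zone settings may still differ.
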
Dima Kurilo answered Nov 15 '22 20:11


There is certainly some difference between the two archives. If even a single byte differs, or the lengths are not exactly the same, the hashes will be different. You can use a hex editor with a compare feature, such as Hex Workshop, to see exactly what the differences are.

Possibilities that come to my mind:

  1. As @orn mentioned, the zip format you are using may store a timestamp for each entry (not sure).
  2. The order in which the files are added to the archive may differ (depending on how you select them or build the source array).
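
If no hex editor is at hand, both possibilities can be checked roughly from a shell. This is a minimal sketch under assumptions: diff_bytes and list_entries are hypothetical helper names, and list_entries presumes the Info-ZIP unzip tool is installed.

```shell
#!/bin/sh
# Sketch only: diff_bytes and list_entries are hypothetical helper names.

# diff_bytes A B [N]
# Print the first N (default 20) differing bytes: the 1-based offset, then
# the byte value in A and in B, both in octal.
diff_bytes() {
    cmp -l "$1" "$2" | head -n "${3:-20}"
}

# list_entries ARCHIVE
# Print entry sizes, stored dates and names in archive order, without the
# "Archive:" header line, so two listings can be compared with diff.
list_entries() {
    unzip -l "$1" | sed 1d
}
```

Running diff_bytes one.zip two.zip shows where the archives diverge. Saving the output of list_entries for each archive and comparing the two listings with diff shows whether the entry order or the stored dates differ. The listing prints times only to the minute, so a small timestamp difference may show up only in the byte comparison.
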
Jonathon Reinhart answered Nov 15 '22 19:11