I am currently working on a tool that uploads a group of files, then uses md5 checksums to compare them with the previous batch and tells you which files have changed.
For regular files this works fine, but some of the uploaded files are zip archives, which almost always show up as changed, even when the files inside them are the same.
Is there a way to perform a different type of checksum to check whether these archives have changed, without having to unzip each one and compare the contents file by file?
Here is my current function:
function check_if_changed($date, $folder, $filename)
{
    $base = './wp-content/uploads/Base/';
    // Collect the dated upload folders so the previous batch can be found.
    $folders = array();
    $dh = opendir($base);
    while (($file = readdir($dh)) !== false) {
        $folders[] = $file;
    }
    closedir($dh);
    sort($folders);
    $position = array_search($date, $folders);
    // No earlier entry to compare against: treat the file as changed.
    if ($position === false || $position === 0) {
        return true;
    }
    $prev_folder = $folders[$position - 1];
    if ($prev_folder == '.' || $prev_folder == '..') {
        return true;
    }
    $newhash = md5_file($base . $date . '/' . $folder . '/' . $filename);
    $oldhash = md5_file($base . $prev_folder . '/' . $folder . '/' . $filename);
    return $oldhash != $newhash;
}
Inside a zip archive, each "file" is stored with metadata such as its last modification time, filename, and size in bytes, and, most importantly, a CRC32 checksum of its uncompressed contents.
You can operate on the zip archive as binary data, find each file's metadata header, and compare its checksum with the one stored from the previous upload. Reading this metadata requires no decompression, so it is very fast.
http://en.wikipedia.org/wiki/Zip_(file_format)
Edit: actually, ZipArchive offers this functionality. See: http://www.php.net/manual/en/ziparchive.statindex.php
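As a rough sketch of that idea, the snippet below uses ZipArchive::statIndex() to build a map of entry name to CRC32 and uncompressed size, then compares the maps of two archives. The function names zip_content_signature() and zip_contents_changed() are made up for illustration, and treating an unreadable archive as "changed" is an assumption you may want to replace with a fallback to md5_file().

```php
<?php
// Hypothetical helper: maps each entry name in a zip archive to "crc32:size".
// Only the archive's metadata is read; nothing is decompressed.
function zip_content_signature(string $path): ?array
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // could not be opened as a zip archive
    }
    $entries = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $stat = $zip->statIndex($i);
        if ($stat === false) {
            continue;
        }
        // Ignore mtime on purpose: it changes whenever the archive is rebuilt.
        $entries[$stat['name']] = sprintf('%08x:%d', $stat['crc'], $stat['size']);
    }
    $zip->close();
    ksort($entries); // entry order inside the archive should not matter
    return $entries;
}

// Hypothetical helper: true if the two archives differ in entry names,
// CRC32 values, or uncompressed sizes.
function zip_contents_changed(string $oldPath, string $newPath): bool
{
    $old = zip_content_signature($oldPath);
    $new = zip_content_signature($newPath);
    if ($old === null || $new === null) {
        return true; // assumption: unreadable archives count as changed
    }
    return $old !== $new;
}
```

In check_if_changed() you could call zip_contents_changed() with the old and new paths when the filename ends in .zip and keep md5_file() for everything else. Keep in mind that CRC32 is a weak checksum, so this detects ordinary edits but is not collision-proof.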