I want to tell whether two tarball files contain identical files, in terms of file name and file content, not including meta-data like date, user, group.
However, There are some restrictions: first, I have no control of whether the meta-data is included when making the tar file, actually, the tar file always contains meta-data, so directly diff the two tar files doesn't work. Second, since some tar files are so large that I cannot afford to untar them in to a temp directory and diff the contained files one by one. (I know if I can untar file1.tar into file1/, I can compare them by invoking 'tar -dvf file2.tar' in file/. But usually I cannot afford untar even one of them)
Any idea how I can compare the two tar files? It would be better if it can be accomplished within SHELL scripts. Alternatively, is there any way to get each sub-file's checksum without actually untar a tarball?
Thanks,
you can use cksum command to see if the checksum of the tar files are same. But for identifying the changed files, you have to extract the tar(tar -xvf) and compare the files.
execute a tar tvf to list the contents of each file and store the outputs in two different files. then, slice out everything besides the filename and size columns. Preferably sort the two files too. Then, just do a file diff between the two lists.
Try also pkgdiff to visualize differences between packages (detects added/removed/renamed files and changed content, exist with zero code if unchanged):
pkgdiff PKG-0.tgz PKG-1.tgz
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With