I am running a recursive diff on two directories with a few options. The directories are fairly large, but I only want to see differences in the contents of the folders, not differences within the files, so I am using the -q option (am I using this right?).
I have also tried an rsync dry run, which seems to take just as long. The output goes through sed; I have tried without it, and it doesn't seem to affect anything. I also ignore hidden files. I think I may be misusing diff -q to just compare the contents of two directories.
I used a code block from another tip to time how long comparing just ONE of these directories took (1 directory, 14 subdirectories): 88 minutes. Every file is a 30-minute TV show, so if diff is comparing the files themselves, that makes sense, but I thought -q would prevent that?
Also, one directory is mounted over AFP, and the other is on a FireWire-connected external drive. This doesn't seem to matter, because I copied both directories locally and the diff took the same amount of time.
I have a solution to this - I ran ls -1 over both directories and diff'd the output - but why is diff taking so long to run?
Here is the code; any suggestions?
#!/bin/bash
before="$(date +%s)"
diff -r -x '.*' /Volumes/directory1/ /Volumes/directory2/ | sed 's/^.\{24\}//g' > /Volumes/stuff.txt
diff -r -x '.*' /Volumes/directory3/ /Volumes/directory4/ | sed 's/^.\{24\}//g' >> /Volumes/stuff.txt
diff -r -x '.*' /Volumes/directory5/ /Volumes/directory6/ | sed 's/^.\{24\}//g' >> /Volumes/stuff.txt
diff -r -x '.*' /Volumes/directory7/ /Volumes/directory8/ | sed 's/^.\{24\}//g' >> /Volumes/stuff.txt
diff -r -x '.*' /Volumes/directory9/ /Volumes/directory10/ | sed 's/^.\{24\}//g' >> /Volumes/stuff.txt
diff -r -x '.*' /Volumes/directory11/ /Volumes/directory12/ | sed 's/^.\{24\}//g' >> /Volumes/stuff.txt
after="$(date +%s)"
elapsed_seconds="$((after - before))"
echo "Elapsed time for code block: $elapsed_seconds"
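When files are different, diff can usually figure that out fairly quickly. When they're the same, though, it has to scan the files in full to verify that they are indeed byte-for-byte identical. The -q option doesn't change this: it only suppresses the details of how the files differ, so the contents still have to be compared.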
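To see what -q does and doesn't do, here is a minimal sketch on throwaway directories. The file names are made up for illustration, and it assumes mktemp and diff are available. Identical files produce no output, but diff still had to compare each pair to know that.

```bash
#!/bin/bash
# Illustration only: build two tiny trees in a temporary directory.
tmp="$(mktemp -d)"
mkdir -p "$tmp/a" "$tmp/b"
printf 'same contents\n' > "$tmp/a/same.txt"
printf 'same contents\n' > "$tmp/b/same.txt"
printf 'version one\n' > "$tmp/a/changed.txt"
printf 'version two\n' > "$tmp/b/changed.txt"
touch "$tmp/a/only_in_a.txt"

# -q reports only WHETHER files differ, not how. File pairs that exist on
# both sides are still compared by content. diff exits non-zero when it
# finds differences, hence the "|| true".
brief_output="$(diff -rq "$tmp/a" "$tmp/b")" || true
echo "$brief_output"

rm -rf "$tmp"
```

The output mentions changed.txt and only_in_a.txt but says nothing about same.txt, which is exactly the case that costs the most time with large files.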
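If all you care about is differences in file names and don't want to inspect the contents of the files, try something like the following (both listings are sorted so that differing directory order doesn't show up as a spurious difference):
diff <(find /Volumes/directory1/ -printf '%P\n' | sort) \
     <(find /Volumes/directory2/ -printf '%P\n' | sort)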
This assumes you have GNU find with the -printf action. If you don't, use some subshell magic per Gordon's comment:
diff <(cd /Volumes/directory1 && find . | sort) \
     <(cd /Volumes/directory2 && find . | sort)
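Since the question also skips hidden files, here is a rough sketch that wraps the same idea in two helper functions. The names list_tree and compare_names are made up for this example, and it uses temporary files rather than process substitution so it should also run under a plain sh. It relies only on find options that both GNU and BSD find support.

```bash
#!/bin/bash
# list_tree (hypothetical helper): print every non-hidden path under
# directory $1, relative to that directory, in sorted order.
list_tree() {
    # -name '.?*' -prune drops hidden entries and does not descend into
    # hidden directories; "." itself does not match the pattern.
    (cd "$1" && find . -name '.?*' -prune -o -print | LC_ALL=C sort)
}

# compare_names (hypothetical helper): diff the two name listings.
# File contents are never read, only directory entries.
compare_names() {
    cn_left="$(mktemp)"
    cn_right="$(mktemp)"
    list_tree "$1" > "$cn_left"
    list_tree "$2" > "$cn_right"
    diff "$cn_left" "$cn_right"
    cn_status=$?
    rm -f "$cn_left" "$cn_right"
    return "$cn_status"
}

# Small self-check on throwaway directories (made-up file names).
demo="$(mktemp -d)"
mkdir -p "$demo/one/show" "$demo/two/show"
touch "$demo/one/show/ep1.avi" "$demo/two/show/ep1.avi"
touch "$demo/one/show/ep2.avi" "$demo/one/.DS_Store"
name_diff="$(compare_names "$demo/one" "$demo/two")" || true
echo "$name_diff"
rm -rf "$demo"
```

On the real data this would be called as compare_names /Volumes/directory1 /Volumes/directory2. Because only names are listed, the run time should depend on the number of files rather than on their size.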