Using Gnome in Linux Mint 12, I copied a Folder of about 9.7 GB (containing a complex tree of subfolders) from one NTFS Flash Drive to another NTFS Flash Drive. According to Gnome the file counts match, but according to du (and other programs) the byte counts don't match. (I've had the same problem copying folders in other Linux distros and Windows XP.)
I only want to know which files don't have matching byte counts. (I don't want to compare the contents of each file, because that would take way too long.) What's the best, easiest and fastest way to find the byte-count-mismatched files?
I would adapt the answer by @user1464130 as it has trouble handling spaces in file names.
(cd dir1 && find . -type f -printf "%p %s\n" | sort > ~/dir1.txt)
(cd dir2 && find . -type f -printf "%p %s\n" | sort > ~/dir2.txt)
diff ~/dir1.txt ~/dir2.txt
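The listing-and-diff approach above can also be wrapped in two small functions. This is only a sketch under a few assumptions: GNU find (for -printf) and mktemp are available, no file name contains a newline, and the function names list_sizes and size_mismatches are made up for illustration. A tab separates the path from the size so that spaces in file names stay unambiguous.

```shell
# list_sizes DIR (hypothetical helper): print "<relative path><TAB><size in bytes>"
# for every regular file under DIR, sorted bytewise by path.
list_sizes() (
    cd "$1" || exit 1
    find . -type f -printf '%p\t%s\n' | LC_ALL=C sort
)

# size_mismatches DIR1 DIR2 (hypothetical helper): print the listing lines that
# differ, i.e. files whose size differs or that exist on one side only.
size_mismatches() (
    left=$(mktemp) right=$(mktemp)
    trap 'rm -f "$left" "$right"' EXIT   # remove the temporary listings on exit
    list_sizes "$1" > "$left"
    list_sizes "$2" > "$right"
    diff "$left" "$right"
)
```

Running size_mismatches dir1 dir2 prints nothing when every path has the same size on both sides. Lines starting with < come from dir1 and lines starting with > from dir2. Only sizes are compared, so no file contents are read.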
If you want to launch a command on each file and use the result in the report, you can use the while Bash construct. This example uses md5sum to compute a checksum for each file.
find . -maxdepth 1 -type f -printf "%p %s\n" | while read -r path size; do echo "$path - $(md5sum "$path" | tr -s " " | cut -f 1 -d " ") - $size" ; done
Each $() is executed separately, which lets us compute the checksum for each file. tr squeezes each run of consecutive spaces into a single space, and cut extracts the n-th field, here the first. Without that step the file name would appear twice, because md5sum also prints it on stdout.
Here is an example without the comparison step (no diff). Note that I've used a dash (-) to separate the three pieces of data we output for each file, but that could be a problem if you want to feed the result to another program.
$ find . -maxdepth 1 -name "*.c" -type f -printf "%p %s\n" | while read -r path size; do echo "$path - $(md5sum "$path" | tr -s " " | cut -f 1 -d " ") - $size" ; done
./thread.c - 5f2b7b12c7cd12fcb9e9796078e5d15b - 584
./utils.c - d61bc1dbc72768e622a04f03e3b8f7a2 - 3413
EDIT: To handle spaces in filenames and still get the checksum and the size, you can use the following code.
$ find . -maxdepth 1 -name "*.c" -type f -print0 | xargs -0 -n 1 md5sum | while read -r checksum path; do echo "$path $(stat --printf="%s" "$path") $checksum" ; done
./ini tia li za tion.c 84 31626123e9056bac2e96b472bd62f309
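As a variation on the same idea, find can hand the file names straight to a small inline script, so names containing spaces never pass through a pipe at all. This is a rough sketch rather than the original answer's method: it assumes GNU stat and md5sum, and the function name report_files is invented for the example. It prints path, size and checksum separated by tabs.

```shell
# report_files DIR (hypothetical helper): print "<path><TAB><size><TAB><md5>"
# for every regular file under DIR. find passes the names as arguments to the
# inline sh script, so spaces in names are preserved.
report_files() {
    find "$1" -type f -exec sh -c '
        for path do
            size=$(stat --printf="%s" "$path")
            # Reading from stdin makes md5sum print "<hash>  -"; keep only the hash.
            sum=$(md5sum < "$path" | cut -d " " -f 1)
            printf "%s\t%s\t%s\n" "$path" "$size" "$sum"
        done' sh {} +
}
```

For example, report_files . lists every file below the current directory. Computing checksums reads every file, so this is much slower than comparing sizes alone.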