I have an md5sum file containing many lines, and I want to use GNU parallel to speed up the checksum verification. When md5sum -c is given no file argument, it reads the checksum lines from stdin. I tried this:
cat checksums.md5 | parallel md5sum -c {}
But I get this error:
md5sum 445350b414a8031d9dd6b1e68a6f2367 testing.gz: No such file or directory
How can I parallelize the md5sum check?
Assuming checksums.md5 has the format:
d41d8cd98f00b204e9800998ecf8427e My file name
Run:
cat checksums.md5 | parallel --pipe -N1 md5sum -c
If your files are small, pass more lines to each md5sum job to reduce the per-process overhead, for example -N100 instead of -N1.
If that does not speed up your processing, make sure your disks are fast enough: md5sum can process 500 MB/s. iostat -dkx 1 can tell you if your disks are a bottleneck.
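To make the approach above concrete, here is a minimal end-to-end sketch. The temporary directory, the three file names, and the -N2 value are made up for illustration. The script falls back to a plain serial md5sum -c when GNU parallel is not installed, so treat it as a demonstration under those assumptions rather than a tuned setup.

```shell
#!/bin/sh
# Keep md5sum's status messages in English for this demo.
export LC_ALL=C

# Hypothetical scratch area with three small files (names are illustrative).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

for name in a b c; do
    printf 'contents of %s\n' "$name" > "$name.txt"
done

# Record the checksums in the format md5sum itself writes.
md5sum a.txt b.txt c.txt > checksums.md5

# Use GNU parallel when it is available; otherwise verify serially.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --pipe feeds the checksum lines to each job on stdin;
    # -N2 gives every md5sum process at most two lines.
    result=$(parallel --pipe -N2 md5sum -c < checksums.md5)
else
    result=$(md5sum -c checksums.md5)
fi

printf '%s\n' "$result"
```

Each file should be reported with an OK status. The order of the lines may differ between runs because the jobs finish independently.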
You need the --pipe option. In this mode parallel splits stdin into blocks and supplies each block to the command via stdin; see man parallel for details:
cat checksums.md5 | parallel --pipe md5sum -c -
By default the block size is 1 MB; it can be changed with the --block option.
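As a rough illustration of the block size, the sketch below builds a checksum list that is far smaller than 1 MB and passes --block 2k so the list is still split across several md5sum processes. The file names, the count of 200 files, and the 2k value are assumptions chosen only to keep the demo small. It also falls back to a serial check when GNU parallel is missing.

```shell
#!/bin/sh
# Keep md5sum's status messages in English for this demo.
export LC_ALL=C

# Hypothetical scratch directory holding 200 tiny files.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

i=1
while [ "$i" -le 200 ]; do
    printf 'file number %s\n' "$i" > "file_$i.dat"
    i=$((i + 1))
done
md5sum file_*.dat > checksums.md5

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # The whole list is only a few kilobytes, so the default 1 MB block
    # would hand everything to a single job. A 2k block splits the list
    # at line boundaries into several smaller jobs.
    ok_count=$(parallel --pipe --block 2k md5sum -c - < checksums.md5 | grep -c ': OK$')
else
    ok_count=$(md5sum -c checksums.md5 | grep -c ': OK$')
fi

echo "verified OK: $ok_count"
```

For a real checksum list with large files, a bigger block is usually fine, since the time spent hashing dominates the cost of starting each md5sum process.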