parallel check md5 file

Question

I have an md5sum checksum file containing many lines, and I want to use GNU parallel to speed up the verification. When md5sum -c is given no file argument, it reads the checksum lines from stdin. I tried this:

cat checksums.md5 | parallel md5sum -c {}

But I get this error:

md5sum 445350b414a8031d9dd6b1e68a6f2367 testing.gz: No such file or directory

How can I parallelize the md5sum check?

Ole Tange · Accepted Answer

Assuming checksums.md5 has the format:

d41d8cd98f00b204e9800998ecf8427e  My file name

Run:

cat checksums.md5 | parallel --pipe -N1 md5sum -c

If your files are small, pass more lines to each job, for example: -N100

If that does not speed up your processing, make sure your disks are fast enough: md5sum can process roughly 500 MB/s. Running iostat -dkx 1 will tell you whether your disks are the bottleneck.

Andrey · Answer

You need the --pipe option. In this mode parallel splits stdin into blocks and supplies each block to the command via stdin (see man parallel for details):

cat checksums.md5 | parallel --pipe md5sum -c -

The default block size is 1 MB; it can be changed with the --block option.
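To see why the checksum lines have to arrive on stdin, here is a small sketch that needs only coreutils md5sum, not parallel. The file name sample.txt is made up for illustration. md5sum -c treats every argument as the name of a checksum file, so a checksum line passed as an argument (which is what {} does in the question) is looked up as a file name and is not found.

```shell
#!/bin/sh
# Sketch: md5sum -c parses checksum lines from stdin, but treats an
# argument as the name of a checksum file. File name is illustrative.
set -u

workdir=$(mktemp -d)
cd "$workdir"

printf 'hello\n' > sample.txt
line=$(md5sum sample.txt)    # one checksum line: "<hash>  sample.txt"

# Checksum line on stdin: md5sum verifies sample.txt.
stdin_result=$(printf '%s\n' "$line" | md5sum -c -)
echo "stdin form: $stdin_result"

# Checksum line as an argument: md5sum looks for a checksum file
# with that literal name and fails.
if md5sum -c "$line" 2>/dev/null; then
    arg_status=ok
else
    arg_status=failed
fi
echo "argument form: $arg_status"
```

Note that a checksum line is typically well under 100 bytes, so a checksum file smaller than the 1 MB default block goes to a single job; a smaller --block value is needed before such a file is split across several jobs.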

Tags:

bash

md5sum

gnu-parallel

Ken
