I have thousands of PNG files which I'd like to make smaller with pngcrush. I have a simple find .. -exec job, but it's sequential. My machine has quite some resources and I'd like to do this in parallel.
The operation to be performed on every png is:
pngcrush input output && mv output input
Ideally I can specify the maximum number of parallel operations.
Is there a way to do this with bash and/or other shell helpers? I'm on Ubuntu or Debian.
If your machine has at least two CPU threads, you can max out CPU resources with multi-process scripting in Bash. The reason is simple: as soon as a secondary process (a subshell sent to the background) is started, it can (and often will) be scheduled on a different CPU thread.
To run commands in parallel in bash, send each one to the background with &. The loop then does not wait for one process to exit before starting the next; without a throttle it would launch all of them at once, so you need a cap on the number of running jobs (a sketch follows).
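For instance, here is a minimal sketch of that approach with a job cap, assuming bash 4.3 or newer (for wait -n) and that the files sit in the current directory; max_jobs is a placeholder value:

#!/bin/bash
max_jobs=4                        # cap on simultaneous pngcrush processes (assumption)
for f in ./*.png; do
    # Crush and replace each file in a background job.
    { pngcrush "$f" "$f.tmp" && mv "$f.tmp" "$f"; } &
    # Throttle: once the cap is reached, block until any one job exits.
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
        wait -n                   # requires bash 4.3 or newer
    done
done
wait                              # let the remaining jobs finish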
The next method is the regular xargs command, which has an option (-P) to specify how many processes to run simultaneously. As a toy input, seq 3 simply prints 1, 2 and 3 on three lines; piping that into xargs makes the parallelism easy to see.
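As a quick demonstration of -P (with sleep standing in for real work), the following runs all three jobs at once, so it finishes in about one second rather than three:

seq 3 | xargs -n 1 -P 3 sh -c 'echo "start $1"; sleep 1; echo "done $1"' sh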
You can use xargs to run multiple processes in parallel:
find /path -name '*.png' -print0 | xargs -0 -n 1 -P <nr_procs> sh -c 'pngcrush "$1" temp.$$ && mv temp.$$ "$1"' sh
xargs will read the list of files produced by find (separated by NUL characters, -0) and run the provided command (sh -c '...' sh) with one file at a time (-n 1), keeping up to <nr_procs> processes running in parallel (-P <nr_procs>).
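For example, to run one worker per CPU core (nproc reports the core count on Ubuntu/Debian; the path and the -name filter are assumptions based on the question):

find . -name '*.png' -print0 | xargs -0 -n 1 -P "$(nproc)" sh -c 'pngcrush "$1" temp.$$ && mv temp.$$ "$1"' sh

Each worker shell expands $$ to its own PID, so concurrent workers never collide on the temporary file name.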