I'm writing a tiny script that calls the "PNGOUT" util on a few hundred PNG files. I simply did this:
find $BASEDIR -iname "*png" -exec pngout {} \;
And then I looked at my CPU monitor and noticed only one of the cores was used, which is quite sad.
In this day and age of dual-, quad-, hexa- and octo-core desktops, how do I simply parallelize this task with Bash? (It's not the first time I've had such a need: quite a lot of these utilities are single-threaded... I already ran into it with MP3 encoders.)
Would simply running all the pngout invocations in the background do? What would my find command look like then? (I'm not too sure how to mix find and the '&' character; below is an untested sketch of what I imagine.)
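Something like this, I suppose (completely untested; the process substitution is only there because, as far as I understand, a plain pipe into while would hide the background jobs from the final wait):
while IFS= read -r -d '' f; do
    pngout "$f" &                               # one background job per file, all started at once
done < <(find "$BASEDIR" -iname "*png" -print0)
wait                                            # block until every pngout has finished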
If I have three hundred pictures, this would mean juggling three hundred processes at once, which doesn't seem great anyway!?
Or should I copy my three hundred or so files into "nb dirs" directories, where "nb dirs" is the number of cores, and then run "nb dirs" finds concurrently? (Which would be close enough.)
But how would I do this?
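This is the kind of thing I picture (again untested; NBDIRS, the dir0..dir3 names and the assumption of 4 cores are all just placeholders):
NBDIRS=4                                        # assumed number of cores
i=0
while IFS= read -r -d '' f; do
    d="dir$(( i % NBDIRS ))"                    # round-robin target directory
    mkdir -p "$d" && cp "$f" "$d"/
    i=$(( i + 1 ))
done < <(find "$BASEDIR" -iname "*png" -print0)
for (( c = 0; c < NBDIRS; c++ )); do
    find "dir$c" -iname "*png" -exec pngout {} \; &   # one find per core
done
wait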
The general way to parallelize any operation is to take the particular function that has to run multiple times and spread those runs across different processors. To do this, you initialize a Pool with n worker processes and pass the function you want to parallelize to one of the Pool's parallelization methods (this is how Python's multiprocessing.Pool works, for instance).
Using wait: we can launch jobs in the background and then, before moving on to the next step, block until they finish with the wait command. wait returns even if a background process exits with a non-zero failure code.
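A minimal, untested sketch of that idea for this case, reusing $BASEDIR from the question, launching the pngout jobs in core-sized batches and using wait as a barrier between batches (nproc from GNU coreutils is assumed to be available):
cores=$(nproc)                                  # assumed: GNU coreutils' nproc reports the core count
i=0
while IFS= read -r -d '' f; do
    pngout "$f" &
    i=$(( i + 1 ))
    if (( i % cores == 0 )); then
        wait                                    # barrier: let the current batch finish first
    fi
done < <(find "$BASEDIR" -iname "*png" -print0)
wait                                            # catch the last, possibly partial batch
The drawback is that every batch waits for its slowest file before the next one starts, so some cores sit idle towards the end of each batch.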
Answering my own question... It turns out there's a relatively unknown feature of the xargs command that can be used to accomplish that:
find . -iname "*png" -print0 | xargs -0 --max-procs=4 -n 1 pngout
Bingo, instant 4x speedup on a quad-core machine :)
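For the record: -print0 and -0 pass the file names NUL-separated so paths with spaces survive the pipe, -n 1 hands each pngout invocation exactly one file, and --max-procs=4 (short form -P 4) keeps up to four of them running at once.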