
Bash: how to simply parallelize tasks?

I'm writing a tiny script that calls the "PNGOUT" util on a few hundred PNG files. I simply did this:

find "$BASEDIR" -iname "*png" -exec pngout {} \;

And then I looked at my CPU monitor and noticed that only one of the cores was being used, which is quite sad.

In this day and age of dual-, quad-, hexa- and octo-core desktops, how do I simply parallelize this task with Bash? (It's not the first time I've had such a need, since quite a lot of these utilities are single-threaded; I already ran into this with MP3 encoders.)

Would simply running all the pngout processes in the background do? What would my find command look like then? (I'm not too sure how to mix find and the '&' character.)

If I have three hundred pictures, this would mean switching between three hundred processes, which doesn't seem great anyway!?

Or should I copy my three hundred or so files into "nb dirs" directories, where "nb dirs" is the number of cores, and then run "nb finds" concurrently? (That would be close enough.)

But how would I do this?

NoozNooz42 asked Jun 09 '10 01:06





1 Answer

Answering my own question: it turns out that xargs has a relatively little-known option, --max-procs (short form -P), that can be used to accomplish this:

find . -iname "*png" -print0 | xargs -0 --max-procs=4 -n 1 pngout

Bingo, roughly a 4x speedup on a quad-core machine :)
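To avoid hard-coding the 4, the same idea can be wrapped in a small function that asks the system how many cores it has. This is only a sketch under a few assumptions: run_parallel is a made-up helper name, nproc comes from GNU coreutils (with getconf as a fallback where it is missing), the short -P form of --max-procs is used, and the pattern is tightened to '*.png', which differs slightly from the "*png" above.

```shell
#!/usr/bin/env bash
# Sketch: run a command once per PNG file, with one worker per CPU core.
# run_parallel is a hypothetical helper name, not part of find or xargs.
run_parallel() {
    local dir=$1
    shift   # the remaining arguments are the command to run on each file

    # Ask for the number of online cores; fall back to 1 if neither tool exists.
    local jobs
    jobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

    # -print0 / -0 keep file names with spaces or newlines intact.
    # -n 1 passes one file per invocation; -P caps the concurrent processes.
    find "$dir" -type f -iname '*.png' -print0 |
        xargs -0 -n 1 -P "$jobs" "$@"
}

# Usage (assumes pngout is on the PATH):
#   run_parallel "$BASEDIR" pngout
```

Because xargs only starts a new process when a slot frees up, this never has more than one pngout per core running at once, which sidesteps the "three hundred background processes" worry from the question.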

NoozNooz42 answered Sep 28 '22 08:09
