Minimal "Task Queue" with stock Linux tools to leverage Multicore CPU

What is the best/easiest way to build a minimal task queue system for Linux using bash and common tools?

I have a file with 9,000 lines. Each line is a bash command line, and the commands are completely independent.

command 1 > Logs/1.log
command 2 > Logs/2.log
command 3 > Logs/3.log
...

My box has more than one core and I want to execute X tasks at the same time. I searched the web for a good way to do this. Apparently, a lot of people have this problem but nobody has a good solution so far.

It would be nice if the solution had the following features:

  • can interpret more than one command (e.g. command; command)
  • can interpret stream redirects on the lines (e.g. ls > /tmp/ls.txt)
  • only uses common Linux tools

Bonus points if it works on other Unix clones without overly exotic requirements.

Manuel asked May 06 '09 23:05



2 Answers

Can you convert your command list to a Makefile? If so, you could just run "make -j X".
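One way the conversion could look is sketched below. It is not from the answer itself: the function name, the `Makefile.jobs` file name and the `jobN` target names are made up for illustration, and it assumes the command file holds one shell command line per line. Note that make runs recipes with /bin/sh unless `SHELL` is set in the Makefile, so bash-only syntax would need that extra line, and a command line starting with `@`, `-` or `+` would be treated specially by make.

```shell
#!/bin/sh
# Sketch only: run every line of a command file as its own make target.
# "Makefile.jobs" and the jobN target names are hypothetical.

run_queue_with_make() {
    runfile=$1                     # file with one shell command line per line
    jobs=$2                        # the X in "make -j X"
    makefile=${3:-Makefile.jobs}   # generated Makefile (hypothetical name)

    # Emit one phony target per non-empty line and attach it to "all".
    # make expands "$" itself, so each "$" is doubled to reach the shell intact.
    awk 'NF {
        gsub(/\$/, "$$")
        printf "all: job%d\n.PHONY: job%d\njob%d:\n\t%s\n", NR, NR, NR, $0
    }' "$runfile" > "$makefile" || return 1

    # -k keeps going after a failing command, so one bad line does not stop the rest.
    make -f "$makefile" -j "$jobs" -k all
}

# Run directly as: sh thisscript.sh runfile 4
if [ "$#" -gt 0 ]; then
    run_queue_with_make "$@"
fi
```

Because each recipe line is handed to the shell as a whole, lines such as "command; command" and lines with redirects should work unchanged.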

Gerald Combs answered Nov 13 '22 18:11



GNU Parallel (http://www.gnu.org/software/parallel/) is a more general tool for parallelizing than PPSS.

If runfile contains:

command 1 > Logs/1.log
command 2 > Logs/2.log
command 3 > Logs/3.log

you can do:

cat runfile | parallel -j+0

which will run as many commands in parallel as there are CPU cores.
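With 9,000 commands it may help to know afterwards which ones failed. The sketch below wraps the same invocation and adds GNU Parallel's `--joblog` option; the function name and the `jobs.log` file name are made up for illustration, and it assumes that `parallel` on the PATH is GNU Parallel rather than the unrelated tool of the same name from moreutils.

```shell
#!/bin/sh
# Sketch only: run a command file through GNU Parallel and keep a job log.
# "jobs.log" is a hypothetical file name.

run_queue_with_parallel() {
    runfile=$1                 # file with one shell command line per line
    joblog=${2:-jobs.log}      # where GNU Parallel records each finished job

    # With no command given, GNU Parallel runs each input line as a command line.
    # -j+0 means one job per CPU core; --joblog writes one row per job.
    parallel -j+0 --joblog "$joblog" < "$runfile"
    status=$?

    # The job log is tab-separated: column 7 is Exitval, column 9 the command.
    awk -F '\t' 'NR > 1 && $7 != 0 { print "failed: " $9 }' "$joblog"
    return "$status"
}

# Run directly as: sh thisscript.sh runfile
if [ "$#" -gt 0 ]; then
    run_queue_with_parallel "$@"
fi
```

The function passes on the exit status of `parallel`, which is non-zero when any job failed.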

If your commands are as simple as the ones above, you do not even need runfile and can do:

seq 1 3 | parallel -j+0 'command {} > Logs/{}.log'

If you have more computers available to do the processing, you may want to look at the --sshlogin and --trc options for GNU Parallel.

Ole Tange answered Nov 13 '22 19:11
