
Running shell script in parallel

I have a shell script which

  1. shuffles a large text file (6 million rows and 6 columns)
  2. sorts the file based on the first column
  3. outputs 1000 files

So the pseudocode looks like this

file1.sh

    #!/bin/bash
    for i in $(seq 1 1000)
    do
        Generating random numbers here, sorting and outputting to file$i.txt
    done

Is there a way to run this shell script in parallel to make full use of multi-core CPUs?

At the moment, ./file1.sh runs iterations 1 to 1000 in sequence, and it is very slow.

Thanks for your help.

Tony asked Apr 05 '11 05:04



2 Answers

Another very handy way to do this is with GNU parallel, which is well worth installing if you don't already have it. It is invaluable when the tasks don't all take the same amount of time.

seq 1000 | parallel -j 8 --workdir $PWD ./myrun {} 

will launch ./myrun 1, ./myrun 2, etc., making sure 8 jobs are running at a time. It can also take lists of nodes if you want to run on several nodes at once, e.g. in a PBS job; our instructions to our users for how to do that on our system are here.
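
As a rough idea of what the ./myrun script invoked above could contain, here is a minimal sketch. The input file name data.txt, the use of shuf, and the stable sort are all assumptions, since the question does not show the real shuffle and sort commands. The index that parallel substitutes for {} arrives as the first argument.

```shell
#!/bin/bash
# Hypothetical worker: shuffle the input, sort on the first column,
# and write the result to file<N>.txt, where N is the job index.
myrun() {
    local i="$1"                  # job index, supplied by parallel via {}
    local input="${2:-data.txt}"  # assumed input file name

    # -s keeps the sort stable, so rows with equal first columns
    # stay in their shuffled order instead of being re-sorted.
    shuf "$input" | sort -s -k1,1 > "file${i}.txt"
}

# When executed as ./myrun 7, run the worker with the given arguments.
if [ "$#" -ge 1 ]; then
    myrun "$@"
fi
```

Each invocation writes only to its own file$i.txt, so the jobs do not need any coordination with each other.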

Updated to add: You want to make sure you're using GNU parallel, not the more limited utility of the same name that comes in the moreutils package (the divergent history of the two is described here).
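
One possible way to check which parallel is installed is sketched below. It relies only on the fact that GNU parallel names itself in the first line of its --version output; the function name is_gnu_parallel is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the parallel found on PATH
# reports itself as GNU parallel.
is_gnu_parallel() {
    # No parallel binary at all: report failure.
    command -v parallel >/dev/null 2>&1 || return 1
    # Look for the "GNU parallel" banner in the first line of --version.
    parallel --version 2>/dev/null </dev/null | head -n 1 | grep -q 'GNU parallel'
}

if is_gnu_parallel; then
    echo "GNU parallel found"
else
    echo "GNU parallel not found"
fi
```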

Jonathan Dursi answered Sep 22 '22 10:09


Check out bash subshells, which can be used to run parts of a script in parallel.

I haven't tested this, but this could be a start:

    #!/bin/bash
    for i in $(seq 1 1000)
    do
        ( Generating random numbers here, sorting and outputting to file$i.txt ) &
        if (( $i % 10 == 0 )); then wait; fi # Limit to 10 concurrent subshells.
    done
    wait
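
A runnable version of the same batching idea might look like the sketch below. The work function is a made-up placeholder for the real shuffle and sort step, and run_batched and its parameters are names chosen for this example only.

```shell
#!/bin/bash
# Hypothetical stand-in for the real "shuffle, sort, write" step.
work() {
    local i="$1" outdir="$2"
    echo "job $i" > "$outdir/file$i.txt"
}

# Run jobs 1..total in the background, pausing after every `batch` launches.
run_batched() {
    local total="$1" batch="$2" outdir="$3" i
    mkdir -p "$outdir"
    for i in $(seq 1 "$total"); do
        work "$i" "$outdir" &    # start the job in the background
        # After each full batch, block until every started job has finished.
        if (( i % batch == 0 )); then
            wait
        fi
    done
    wait                         # collect jobs from a partial final batch
}

outdir="$(mktemp -d)"
run_batched 25 10 "$outdir"
ls "$outdir" | wc -l             # counts the files produced
```

Because each batch ends with wait, the loop idles until the slowest job in the batch finishes. That is the case where GNU parallel, which is described above as keeping a fixed number of jobs running, tends to use the cores better.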

Anders Lindahl answered Sep 23 '22 10:09