I have a bash script similar to:
NUM_PROCS=$1
NUM_ITERS=$2
for ((i=0; i<$NUM_ITERS; i++)); do
    python foo.py $i arg2 &
done
What's the most straightforward way to limit the number of parallel processes to NUM_PROCS? I'm looking for a solution that doesn't require packages/installations/modules (like GNU Parallel) if possible.
When I tried Charles Duffy's latest approach, I got the following bash -x trace:
+ python run.py args 1
+ python run.py ... 3
+ python run.py ... 4
+ python run.py ... 2
+ read -r line
+ python run.py ... 1
+ read -r line
+ python run.py ... 4
+ read -r line
+ python run.py ... 2
+ read -r line
+ python run.py ... 3
+ read -r line
+ python run.py ... 0
+ read -r line
... continuing with other numbers between 0 and 5, until too many processes were started for the system to handle and the bash script was shut down.
Often, you can run Bash scripts in parallel, which can dramatically speed things up.
You can iterate over a sequence of numbers in bash in two ways: one is by using the seq command, and another is by specifying the range in a C-style for loop. By default, seq starts the sequence at one, increments by one at each step, and prints each number on its own line up to the upper limit.
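For example, a trivial sketch of the two forms, both printing 1 through 5, one number per line:

# seq defaults: start at 1, step by 1, print one number per line
for i in $(seq 5); do echo "$i"; done

# the same range written as a C-style for loop
for ((i=1; i<=5; i++)); do echo "$i"; done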
With -P 2, xargs will run the first two commands in parallel, and then whenever one of them terminates it will start another one, until the entire job is done. The same idea generalizes to as many processors as you have handy, and to other resources besides processors.
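As a minimal sketch of that idea, using foo.py from the question (any limit can be passed to -P):

# -P 2 keeps two jobs running; as each one exits, xargs starts the next
seq 0 9 | xargs -I{} -P 2 python foo.py {} arg2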
Here's a relatively simple way to accomplish this with only two additional lines of code; the explanation is inline.
NUM_PROCS=$1
NUM_ITERS=$2
for ((i=0; i<NUM_ITERS; i++)); do
    # once $NUM_PROCS workers have been spawned, wait for one to exit
    # before starting the next, so the limit is never exceeded
    (( i >= NUM_PROCS )) && wait -n
    python foo.py "$i" arg2 &
done
wait # wait for all remaining workers
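Invoked as, say, ./yourscript.sh 4 100 (the script name is just a placeholder), this runs 100 iterations with at most 4 workers alive at any moment.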
This isn't the simplest solution, but if your version of bash doesn't have wait -n and you don't want to use other programs like parallel or awk, here is a solution using while and for loops.
num_iters=10
total_threads=4
iter=1
while [[ "$iter" -le "$num_iters" ]]; do
    # how many iterations remain, counting this one (plain bash
    # arithmetic, so there is no dependency on bc)
    iters_remainder=$(( num_iters - iter + 1 ))
    # run a full batch, or a smaller final batch if fewer remain
    if [[ "$iters_remainder" -lt "$total_threads" ]]; then
        threads=$iters_remainder
    else
        threads=$total_threads
    fi
    for ((t=1; t<=threads; t++)); do
        (
            # do stuff
        ) &
        ((++iter))
    done
    wait
done
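Note the trade-off: this runs the work in batches of up to $total_threads and waits for the entire batch, so if one job in a batch is much slower than the rest, the idle slots aren't refilled until it finishes. The wait -n approaches above and below don't have that gap.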
bash 4.4 will have an interesting new type of parameter expansion that simplifies Charles Duffy's answer.
#!/bin/bash
num_procs=$1
num_iters=$2
num_jobs="\j"   # the prompt escape for number of jobs currently running
for ((i=0; i<num_iters; i++)); do
    while (( ${num_jobs@P} >= num_procs )); do
        wait -n
    done
    python foo.py "$i" arg2 &
done
wait # wait for the remaining jobs
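The @P operator expands the string as if it were a prompt, so the \j escape becomes the current number of running jobs; that spares you the PID bookkeeping that the array-based approach below needs.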
GNU, macOS/OSX, FreeBSD, and NetBSD can all do this with xargs -P, with no particular bash version or package installs required. Here's 4 processes at a time:
printf "%s\0" {1..10} | xargs -0 -I @ -P 4 python foo.py @ arg2
As a very simple implementation, depending on a version of bash new enough to have wait -n (to wait until the next job exits, as opposed to waiting for all jobs):
#!/bin/bash
#      ^^^^ - NOT /bin/sh!
num_procs=$1
num_iters=$2
declare -A pids=( )
for ((i=0; i<num_iters; i++)); do
    while (( ${#pids[@]} >= num_procs )); do
        wait -n
        # wait -n doesn't say which job exited, so prune every pid
        # that is no longer alive from the bookkeeping table
        for pid in "${!pids[@]}"; do
            kill -0 "$pid" &>/dev/null || unset "pids[$pid]"
        done
    done
    python foo.py "$i" arg2 & pids["$!"]=1
done
If running on a shell without wait -n, one can (very inefficiently) replace it with a command such as sleep 0.2, to poll every 1/5th of a second.
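For instance, a sketch of that fallback, reusing the pids array from the loop above; the sleep merely paces the polling:

while (( ${#pids[@]} >= num_procs )); do
    sleep 0.2 # no wait -n available: poll every 1/5th of a second
    for pid in "${!pids[@]}"; do
        kill -0 "$pid" &>/dev/null || unset "pids[$pid]"
    done
done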
Since you're actually reading input from a file, another approach is to start N subprocesses, each of which processes only the lines where (linenum % N == threadnum):
num_procs=$1
infile=$2
for ((i=0; i<num_procs; i++)); do
    (
        while read -r line; do
            echo "Thread $i: processing $line"
        # awk hands this worker only the lines whose line number,
        # modulo num_procs, equals the worker's index
        done < <(awk -v num_procs="$num_procs" -v i="$i" \
                     'NR % num_procs == i { print }' <"$infile")
    ) &
done
wait # wait for all the $num_procs subprocesses to finish
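Unlike the job-control approaches above, this partitions the input up front, so the workers need no coordination while they run; the trade-off is that an unlucky worker can be handed all of the slow lines and keep running after its siblings have finished.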