I've been trying to get a for loop to run a bunch of commands sort of simultaneously and was attempting to do it via subshells. Ive managed to cobble together the script below to test and it seems to work ok.
#!/bin/bash
for i in {1..255}; do
(
#commands
)&
done
wait
The only problem is that my actual loop is going to be for i in files* and then it just crashes, i assume because its started too many subshells to handle. So i added
#!/bin/bash
for i in files*; do
(
#commands
)&
if (( $i % 10 == 0 )); then wait; fi
done
wait
which now fails. Does anyone know a way around this? Either using a different command to limit the number of subshells or provide a number for $i?
Cheers
Another solution would be to use tools designed for concurrency:
printf '%s\0' files* | xargs -0 -P6 -n1 yourScript
The -P6
is the maximum number of concurrent processes that xargs
will launch. Make it 10 if you like.
I suggest xargs
because it is likely already on your system. If you want a really robust solution, look at GNU Parallel.
For another answer explicit to your question: Get the counter as the array index?
files=( files* )
for i in "${!files[@]}"; do
commands "${files[i]}" &
(( i % 10 )) || wait
done
(The parentheses around the compound command aren't important because backgrounding the job will have the same effects as using a subshell anyway.)
Just different semantics:
simultaneous() {
while [[ $1 ]]; do
for i in {1..11}; do
[[ ${@:i:1} ]] || break
commands "${@:i:1}" &
done
shift 10 || shift "$#"
wait
done
}
simultaneous files*
You can find useful to count the number of jobs with jobs
. e.g.:
wc -w <<<$(jobs -p)
So, your code would look like this:
#!/bin/bash
for i in files*; do
(
#commands
)&
if (( $(wc -w <<<$(jobs -p)) % 10 == 0 )); then wait; fi
done
wait
As @chepner suggested:
In bash 4.3, you can use wait -n
to proceed as soon as any job completes, rather than waiting for all of them
Define the counter explicitly
#!/bin/bash
for f in files*; do
(
#commands
)&
(( i++ % 10 == 0 )) && wait
done
wait
There's no need to initialize i
, as it will default to 0 the first time you use it. There's also no need to reset the value, as i %10
will be 0 for i=10, 20, 30, etc.
If you have Bash≥4.3, you can use wait -n
:
#!/bin/bash
max_nb_jobs=10
for i in file*; do
# Wait until there are less than max_nb_jobs jobs running
while mapfile -t < <(jobs -pr) && ((${#MAPFILE[@]}>=max_nb_jobs)); do
wait -n
done
{
# Your commands here: no useless subshells! use grouping instead
} &
done
wait
If you don't have wait -n
available, you can use something like this:
#!/bin/bash
set -m
max_nb_jobs=10
sleep_jobs() {
# This function sleeps until there are less than $1 jobs running
local n=$1
while mapfile -t < <(jobs -pr) && ((${#MAPFILE[@]}>=n)); do
coproc read
trap "echo >&${COPROC[1]}; trap '' SIGCHLD" SIGCHLD
[[ $COPROC_PID ]] && wait $COPROC_PID
done
}
for i in files*; do
# Wait until there are less than 10 jobs running
sleep_jobs "$max_nb_jobs"
{
# Your commands here: no useless subshells! use grouping instead
} &
done
wait
The advantage of proceeding like this, is that we make no assumptions on the time taken to finish the jobs. A new job is launched as soon as there's room for it. Moreover, it's all pure Bash, so doesn't rely on external tools and (maybe more importantly), you may use your Bash environment (variables, functions, etc.) without exporting them (arrays can't be easily exported so that can be a huge pro).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With