I had a small script that sourced each OpenStack tenant's credentials in turn and fetched some output with the help of Python. The reports took too long to generate, and it was suggested that I use xargs. My earlier code looked like this:
#!/bin/bash
cd /scripts/cloud01/floating_list
rm -rf ./reports/openstack_reports/
mkdir -p ./reports/openstack_reports/
source ../creds/base
for tenant in A B C D E F G H I J K L M N O P Q R S T
do
source ../creds/$tenant
python ../tools/openstack_resource_list.py > ./reports/openstack_reports/$tenant.html
done
lftp -f ./lftp_script
Now I have put xargs in the script, and it looks something like this:
#!/bin/bash
cd /scripts/cloud01/floating_list
rm -rf ./reports/openstack_reports/
mkdir -p ./reports/openstack_reports/
source ../creds/base
# Need xargs idea below
cat tenants_list.txt | xargs -P 8 -I '{}' # something that takes the tenant name and source
TENANT_NAME={}
python ../tools/openstack_resource_list.py > ./reports/openstack_reports/$tenant.html
lftp -f ./lftp_script
In this script, how am I supposed to implement source ../creds/$tenant? Each tenant's credentials need to be sourced while that tenant is being processed, and I am not sure how to include that with xargs for parallel execution.
xargs is a Unix command that builds and executes commands from standard input: it reads items from its input and passes them as arguments to another command. This is particularly useful in file management, where xargs is combined with rm, cp, mkdir, and similar commands.
With -P N, xargs runs up to N commands in parallel, and whenever one of them terminates, it starts another one, until the entire job is done. N usually matches the number of processors you have handy, but the same idea generalizes to other limited resources.
With the -I option, you define a replace-str after -I, and all occurrences of the replace-str in the command are replaced with the argument passed to xargs. To run multiple commands per argument, combine -I with a shell such as sh -c or bash -c, because xargs itself only runs a single command.
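As a small illustration of the replace-str behaviour, the sketch below uses echo in place of a real command; the tenant names A and B and the message text are made up for the example.

```shell
# Every occurrence of the replace-str {} in the command is substituted,
# and the command runs once per input line.
out=$(printf '%s\n' A B | xargs -I {} echo "report for {} goes to {}.html")
printf '%s\n' "$out"
# report for A goes to A.html
# report for B goes to B.html
```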
xargs can't easily run a shell function ... but it can run a shell.
# If the tenant names are this simple, don't put them in a file
printf '%s\n' {A..T} |
xargs -P 8 -I {} bash -c 'source ../creds/"$0"
python ../tools/openstack_resource_list.py > ./reports/openstack_reports/"$0".html' {}
Somewhat obscurely, the argument after bash -c '...' gets exposed as $0 inside the script.
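To see the $0 behaviour in isolation, here is a minimal sketch with the credentials and the Python tool left out; the echo message is just a stand-in.

```shell
# xargs substitutes each input line for {}; because {} is the first word
# after the -c script, bash exposes it to that script as $0.
out=$(printf '%s\n' A B | xargs -I {} bash -c 'echo "tenant is $0"' {})
printf '%s\n' "$out"
# tenant is A
# tenant is B
```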
If you want to keep the tenants in a file, xargs -a filename is a good way to avoid the useless use of cat, though it's not portable to all xargs implementations. (Redirecting with xargs ... <filename is obviously completely portable.)
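A sketch of the redirect form, assuming a list file with one tenant per line; the temporary file here stands in for tenants_list.txt, and echo stands in for the real per-tenant command.

```shell
# Hypothetical tenant list, one name per line (stand-in for tenants_list.txt)
list=$(mktemp)
printf '%s\n' A B C > "$list"

# Portable: redirect the file into xargs instead of piping it through cat.
# Where -a is supported, "xargs -a "$list" -I {} ..." reads the same file.
out=$(xargs -I {} echo "tenant {}" < "$list")
printf '%s\n' "$out"

rm -f "$list"
```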
For efficiency, you could refactor the script to loop over as many arguments as possible:
printf '%s\n' {A..T} |
xargs -n 3 -P 8 bash -c 'for tenant; do
source ../creds/"$tenant"
python ../tools/openstack_resource_list.py > ./reports/openstack_reports/"$tenant".html
done' _
This will run a maximum of 8 parallel shell instances with a maximum of 3 tenants assigned to each (so in actual fact only 7 instances), though with this small number of arguments, the difference in performance is probably negligible.
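The count of 7 can be checked with a quick sketch that replaces the real work with echo "$#", so each shell instance prints only how many tenants it received.

```shell
# 20 tenant names in batches of at most 3: six batches of 3 and one of 2.
# Each bash instance prints one line, so the line count equals the number
# of instances that xargs started.
count=$(printf '%s\n' A B C D E F G H I J K L M N O P Q R S T |
  xargs -n 3 -P 8 bash -c 'echo "$#"' _ | wc -l | tr -d ' ')
echo "shell instances started: $count"
# shell instances started: 7
```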
Because we are now actually receiving a list of arguments, we pass _ as the value to populate $0 with (just because it needs to be set to something, in order to get the real arguments in place properly).
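The effect of the placeholder is easy to demonstrate without xargs at all. In this sketch, the same loop is run with and without _, and the first tenant silently disappears in the second case because it is consumed as $0.

```shell
# "for tenant; do" loops over "$@", which starts at $1, not $0.
with_placeholder=$(bash -c 'for tenant; do printf "%s " "$tenant"; done' _ A B C)
without_placeholder=$(bash -c 'for tenant; do printf "%s " "$tenant"; done' A B C)

echo "with _:    $with_placeholder"     # A B C
echo "without _: $without_placeholder"  # B C   (A became $0)
```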
If the source might make modifications which are not always guaranteed to be overwritten by the source in the next iteration (say, some tenants have variables which need to be unset for some other tenants?) that complicates matters, but maybe post a separate question if you really actually need help resolving that; or just fall back to the first variant where each tenant is run in a separate shell instance.
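One possible middle ground, which is not part of the variants above, is to wrap each loop iteration in a subshell so that whatever one tenant's file sets is discarded before the next tenant runs. The sketch below only illustrates the idea: the credential files, the OS_TENANT and OS_EXTRA variables, and the CREDS_DIR variable are all made up for the demonstration.

```shell
creds=$(mktemp -d)
# Hypothetical credential files: A sets an extra variable that B never resets
printf 'export OS_TENANT=A\nexport OS_EXTRA=only-for-A\n' > "$creds/A"
printf 'export OS_TENANT=B\n' > "$creds/B"

# Plain loop: OS_EXTRA from tenant A is still set while tenant B is processed
leaky=$(CREDS_DIR="$creds" bash -c 'for t; do
  source "$CREDS_DIR/$t"; echo "$t:${OS_EXTRA:-unset}"
done' _ A B)

# Subshell per tenant: the parentheses throw away each tenant's variables
isolated=$(CREDS_DIR="$creds" bash -c 'for t; do
  ( source "$CREDS_DIR/$t"; echo "$t:${OS_EXTRA:-unset}" )
done' _ A B)

printf 'plain loop:\n%s\nsubshell per tenant:\n%s\n' "$leaky" "$isolated"
rm -rf "$creds"
```

In the plain loop, B reports only-for-A; with the subshell, B reports unset. The subshell costs one fork per tenant, which is still cheaper than starting a fresh bash for each one.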