I am trying to run serial MPI jobs masked as one job on our supercomputer. The main submission script basically looks like this:
#!/bin/bash -l
#PBS -l nodes=4:ppn=8,walltime=24:00:00
cat $PBS_NODEFILE | uniq | tr '\012' ' ' > tmp-$PBS_JOBID
read -a NODE < tmp-$PBS_JOBID
rm tmp-$PBS_JOBID
inode=-1
ijob=0
for ((K=1;K<=8;K++))
do
[ $((ijob++ % 2)) -eq 0 ] && ((inode++))
ssh ${NODE[inode]} _somepath_/RUN$K/sub.script &
done
wait
exit 0
Each sub.script looks like:
#!/bin/bash -l
#PBS -l walltime=24:00:00,nodes=1:ppn=4
module load intel
module load ompi
export FORT_BUFFERED=1
*run executable*
wait
exit 0
And sometimes I encounter an error for each sub.script (jobs die immediately):
/bin/bash: -
: invalid option
Usage: /bin/bash [GNU long option] [option] ...
/bin/bash [GNU long option] [option] script-file ...
*etc.*
The most interesting thing is that the error is random: if I run the same script a second (or third, etc.) time, it runs without any problems. Sometimes I'm lucky, sometimes I'm not... Removing -l won't help, because in that case the modules cannot be loaded and mpirun won't work. Any suggestions on how to fix it?
Thanks a lot in advance!
/bin/bash is the shell most commonly used as the default login shell on Linux systems. The name is an acronym for Bourne-again shell. Bash can execute the vast majority of scripts and is widely used because it has more features, is well developed, and has better syntax.
$1 is the first input argument, and -z tests whether a string is undefined or empty. You're testing whether an input argument was supplied when the script was run.
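A minimal sketch of that test, in case it helps. The function name `check_arg` and the messages are made up for illustration; only `-z` and `$1` come from the explanation above.

```shell
#!/bin/bash
# Hypothetical helper: report whether the first argument is missing or empty.
check_arg() {
    if [ -z "$1" ]; then
        echo "no argument given"
    else
        echo "argument: $1"
    fi
}

check_arg          # prints "no argument given"
check_arg ""       # prints "no argument given"
check_arg RUN1     # prints "argument: RUN1"
```

Quoting "$1" matters here: without the quotes, an empty argument disappears from the test entirely.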
"set -e" makes the shell exit as soon as a command returns a non-zero exit status, so execution stops where the error occurs. Then the error function is declared. Its only purpose is to display the error message along with the number of the line that contains the error.
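Here is a rough sketch of how "set -e" and such an error function can fit together. The handler name `error`, its message, and the temporary demo script are assumptions for illustration, not the code the answer refers to. The demo script is written to a temporary file and run in a separate bash process so that its exit does not end the calling shell.

```shell
#!/bin/bash
# Write a small hypothetical script to a temporary file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
set -e
# Hypothetical error handler: print the line number it is given.
error() {
    echo "error on line $1" >&2
}
trap 'error $LINENO' ERR

echo "before"
false            # non-zero exit status: the ERR trap fires, then the shell exits
echo "after"     # never reached
EOF

# Run the demo in its own bash process and capture output and exit status.
status=0
output=$(bash "$demo" 2>&1) || status=$?
rm -f "$demo"

echo "$output"
echo "exit status: $status"
```

The "after" line never prints, and the exit status of the demo is that of the failing command.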
Your script probably has characters in it that you cannot see. Perhaps it was copied and pasted with the wrong character set translation, or it is in DOS format. In the latter case you can use the tofrodos or dos2unix package to correct it.
In either case you could pull it up in 'vi' or another application, which will usually show weird characters like ^@ or ^M. You could also try cat -v filename, which might help you see these oddities. If push comes to shove, try hexdump (or hd, or od).
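To show what that looks like, here is a small sketch that builds a throwaway file with DOS line endings and inspects it. The file contents and the variable names `sample`, `visible` and `cr_lines` are invented for the example. Note that a carriage return right after "-l" on the first line would be consistent with the "-" followed by a line break in the error message above, though that is a guess about the original poster's files.

```shell
#!/bin/bash
# Create a hypothetical script with DOS (CRLF) line endings.
sample=$(mktemp)
printf '#!/bin/bash -l\r\necho hello\r\n' > "$sample"

# cat -v renders each carriage return as ^M.
visible=$(cat -v "$sample")
echo "$visible"
# prints:
#   #!/bin/bash -l^M
#   echo hello^M

# Count the lines that contain a carriage return.
cr_lines=$(grep -c $'\r' "$sample")
echo "lines with CR: $cr_lines"    # prints "lines with CR: 2"

rm -f "$sample"
```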
I just encountered this and I had invalid line endings. I changed from CRLF to LF and that fixed it!
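If dos2unix is not installed, the same conversion can be sketched with plain tr. This is a stand-in for the tools named above, not their implementation, and the file names are temporary files invented for the example.

```shell
#!/bin/bash
# Create a hypothetical script with CRLF line endings.
sample=$(mktemp)
fixed=$(mktemp)
printf '#!/bin/bash -l\r\necho hello\r\n' > "$sample"

# Delete every carriage return, leaving plain LF line endings.
tr -d '\r' < "$sample" > "$fixed"

# Verify: no line in the converted file contains a carriage return.
# grep -c exits non-zero when the count is 0, hence the "|| true".
remaining=$(grep -c $'\r' "$fixed" || true)
echo "lines with CR after conversion: $remaining"

# The converted script now runs cleanly.
result=$(bash "$fixed")
echo "$result"    # prints "hello"

rm -f "$sample" "$fixed"
```

Writing to a second file and checking it before replacing the original is the cautious route. Be aware that tr -d removes every carriage return, not only those at line ends.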