I'm trying to run my program under the Torque scheduler using mpirun. In my PBS file I set the library path with
export LD_LIBRARY_PATH=/path/to/library
yet the program fails with the error
error while loading shared libraries: libarmadillo.so.3:
cannot open shared object file: No such file or directory.
I guess the problem is that LD_LIBRARY_PATH is not set on all the nodes. How can I make it work?
Mpiexec is a replacement program for the script mpirun, which is part of the mpich package. It is used to initialize a parallel job from within a PBS batch or interactive environment. Mpiexec uses the task manager library of PBS to spawn copies of the executable on the nodes in a PBS allocation.
With OpenMPI, the easiest thing to do is to run ompi_info; the first few lines will give you the information you want. In your own code, if you don't mind something OpenMPI specific, you can use OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION, and OMPI_RELEASE_VERSION from mpi.h.
To do this on our system you need to: (1) have your public RSA key in a file named ~/.ssh/authorized_keys2; and (2) usually run ssh-agent and ssh-add in the terminal from which you will run mpirun (this is often needed when starting from a machine you are logged into remotely).
LD_LIBRARY_PATH is not exported automatically to MPI processes spawned by mpirun. You should use
mpirun -x LD_LIBRARY_PATH ...
to push the value of LD_LIBRARY_PATH. Also make sure that the specified path exists on all nodes in the cluster and that libarmadillo.so.3 is available everywhere.
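Before launching, it can help to confirm that the library really is reachable through LD_LIBRARY_PATH. The sketch below is only an illustration: lib_in_path is a hypothetical helper name, not part of Torque or any MPI distribution, and it only inspects the node where the script runs, so it does not prove the path exists on the other nodes.

```shell
# Hypothetical helper (not part of any MPI package): succeeds if a file
# named "$1" exists in one of the directories listed in LD_LIBRARY_PATH.
# The body runs in a subshell, so the IFS and globbing changes stay local.
lib_in_path() (
    IFS=:      # split LD_LIBRARY_PATH on colons
    set -f     # do not expand glob characters in directory names
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -n "$dir" ] && [ -e "$dir/$1" ]; then
            exit 0
        fi
    done
    exit 1
)

# Possible use inside the PBS script (left commented out here):
# lib_in_path libarmadillo.so.3 || { echo "library not found" >&2; exit 1; }
# mpirun -x LD_LIBRARY_PATH ...
```

If the check passes on the first node but the job still fails, the directory is probably missing on one of the other nodes in the allocation.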
On some systems, your environment isn't always propagated via mpirun. You should set all those variables in your .bashrc file.
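A minimal sketch of what such a .bashrc entry might look like, assuming the same placeholder directory /path/to/library as in the question. It puts the directory in front of any existing value and avoids leaving a stray trailing colon when LD_LIBRARY_PATH was previously empty.

```shell
# Placeholder directory taken from the question; replace with the real one.
# ${VAR:+:$VAR} expands to ":<old value>" only when VAR is already non-empty.
export LD_LIBRARY_PATH="/path/to/library${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Some default .bashrc files return early for non-interactive shells, so the line may need to go near the top of the file to take effect for processes started on remote nodes.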