I'm trying to run my program under the Torque scheduler using mpirun. In my PBS file I set the library path with
export LD_LIBRARY_PATH=/path/to/library
yet the job fails with this error:
error while loading shared libraries: libarmadillo.so.3:
cannot open shared object file: No such file or directory.
I guess the problem is that LD_LIBRARY_PATH is not set on all the nodes. How can I make it work?
Mpiexec is a replacement program for the script mpirun, which is part of the mpich package. It is used to initialize a parallel job from within a PBS batch or interactive environment. Mpiexec uses the task manager library of PBS to spawn copies of the executable on the nodes in a PBS allocation.
With OpenMPI, the easiest thing to do is to run ompi_info; the first few lines will give you the information you want. In your own code, if you don't mind something OpenMPI-specific, you can use OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION, and OMPI_RELEASE_VERSION from mpi.h.
To do this on our system you need to: (1) have your public RSA key in a file named ~/.ssh/authorized_keys2; and (2) usually run ssh-agent and ssh-add in the terminal from which you will run mpirun (often needed when starting from a machine you are logged into remotely).
LD_LIBRARY_PATH is not automatically exported to the MPI processes spawned by mpirun. You should use
mpirun -x LD_LIBRARY_PATH ...
to push the value of LD_LIBRARY_PATH to the remote processes (-x is an Open MPI option). Also make sure that the specified path exists on every node in the cluster and that libarmadillo.so.3 is available in it everywhere.
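To check the "available everywhere" part, a small helper along these lines could be run on each node. This is only a sketch: find_lib is a hypothetical name, not part of Torque or Open MPI, and it only looks in the directories listed in LD_LIBRARY_PATH, not in the system default library directories.

```shell
# find_lib NAME: print the first directory in LD_LIBRARY_PATH that contains NAME.
# Exit status is 0 if the library was found, 1 otherwise.
# (find_lib is a hypothetical helper name, used here for illustration only.)
find_lib() (
    # The body runs in a subshell, so the IFS and set -f changes do not leak out.
    set -f          # disable globbing while the path list is split
    IFS=:           # LD_LIBRARY_PATH entries are separated by colons
    for dir in $LD_LIBRARY_PATH; do
        if [ -n "$dir" ] && [ -e "$dir/$1" ]; then
            printf '%s\n' "$dir"
            exit 0
        fi
    done
    exit 1
)
```

If this is saved in a script (say check_lib.sh, also a made-up name) that ends with a line such as find_lib libarmadillo.so.3 || echo "missing on $(hostname)", launching it with mpirun -x LD_LIBRARY_PATH sh check_lib.sh from the PBS file should show which nodes cannot see the library.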
On some systems, mpirun does not propagate your environment to the remote processes. In that case, set the variables in your .bashrc file so that the shells started on each node pick them up.
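As a rough sketch of the .bashrc approach, assuming bash is the login shell on the compute nodes and keeping /path/to/library as the placeholder from the question:

```shell
# ~/.bashrc fragment (sketch). Put this near the top of the file: many stock
# .bashrc files return early for non-interactive shells, which would skip
# anything placed further down.
# /path/to/library is the placeholder path from the question, not a real location.
export LD_LIBRARY_PATH="/path/to/library${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The ${LD_LIBRARY_PATH:+:...} form only adds the colon separator when the variable already has a value, which avoids a trailing empty entry in the search path.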