I am having trouble running a Python script on a computing cluster, and I apologize ahead of time if this is a naive mistake. I'm not sure whether the problem comes from my having configured my conda virtual environment incorrectly, but it is reproduced when I run:
srun -p use-everything --pty python test.py
I get the error
Traceback (most recent call last):
File "test.py", line 4, in <module>
from acme.agents.tf import dqn
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/agents/tf/dqn/__init__.py", line 18, in <module>
from acme.agents.tf.dqn.agent import DQN
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/agents/tf/dqn/agent.py", line 20, in <module>
from acme import datasets
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/datasets/__init__.py", line 17, in <module>
from acme.datasets.reverb import make_reverb_dataset
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/datasets/reverb.py", line 22, in <module>
from acme.adders import reverb as adders
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/adders/reverb/__init__.py", line 21, in <module>
from acme.adders.reverb.base import DEFAULT_PRIORITY_TABLE
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/adders/reverb/base.py", line 26, in <module>
import reverb
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/reverb/__init__.py", line 27, in <module>
from reverb import item_selectors as selectors
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/reverb/item_selectors.py", line 19, in <module>
from reverb import pybind
File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/reverb/pybind.py", line 1, in <module>
import tensorflow as _tf; from .libpybind import *; del _tf
ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory
srun: error: node014: task 0: Exited with exit code 1
On my local machine, I struggled with the same issue when running in a virtual environment, and I solved it simply with sudo apt-get install libpython3.7.
Here are some other things that may be helpful to know.
$which libpython
/usr/bin/which: no libpython in (/om2/user/armas/anaconda/envs/dist_rl/bin:/om2/user/armas/anaconda/bin:/om2/user/armas/anaconda/condabin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
$echo $PATH
/om2/user/armas/anaconda/envs/dist_rl/bin:/om2/user/armas/anaconda/bin:/om2/user/armas/anaconda/condabin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
$echo $LD_LIBRARY_PATH
/om2/user/armas/anaconda/bin/
When I change my LD_LIBRARY_PATH, i.e. export LD_LIBRARY_PATH=/om2/user/armas/anaconda/lib:$LD_LIBRARY_PATH, and run the script, Python reports that jax is not installed. I ran pip install dm-acme[jax], and now when I run the script it says there is no module named atari_py. I think it is leading me down a chain of dependencies.
I had installed acme using this link, but inside a conda environment. My system admin said it could possibly be that acme is not made for anaconda. If that is the case, why would that be?
If there's anything I missed, please let me know and I'll be sure to add it. Thank you again!
Try this:
sudo apt-get install libpython3.7
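On a cluster where sudo is not available, it may help to first check whether the interpreter's own shared library is already present inside the conda environment. The following is a minimal diagnostic sketch, not a confirmed fix: it uses only the standard sysconfig module, the helper name libpython_candidates is hypothetical, and it should be run with the same interpreter and on the same node as the failing script (for example through the srun command from the question).

```python
import os
import sys
import sysconfig


def libpython_candidates():
    """Return a list of (path, exists) pairs for this interpreter's libpython."""
    # LIBDIR is the directory the interpreter was configured to install libraries in;
    # it can be None on some platforms.
    libdir = sysconfig.get_config_var("LIBDIR")
    # INSTSONAME / LDLIBRARY hold the library file name, e.g. libpython3.7m.so.1.0.
    names = {
        sysconfig.get_config_var("INSTSONAME"),
        sysconfig.get_config_var("LDLIBRARY"),
    }
    results = []
    for name in sorted(n for n in names if n):
        path = os.path.join(libdir, name) if libdir else name
        results.append((path, os.path.exists(path)))
    return results


if __name__ == "__main__":
    print("interpreter:    ", sys.executable)
    print("prefix:         ", sys.prefix)
    print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
    for path, exists in libpython_candidates():
        print(("found   " if exists else "missing ") + path)
```

If the library is reported as found under the environment's own lib directory, one thing to try is prepending that directory to LD_LIBRARY_PATH instead of the base anaconda bin or lib directory, since the base installation may carry a different set of packages than the dist_rl environment. This is an assumption based on the paths in the question, not something verified on that cluster.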