I have a strange issue that comes and goes randomly and I really can't figure out when and why.
I am running a snakemake pipeline like this:
conda activate $myEnv
snakemake -s $snakefile --configfile test.conf.yml --cluster "python $qsub_script" --latency-wait 60 --use-conda -p -j 10 --jobscript "$job_script"
I installed snakemake 5.9.1 (also tried downgrading to 5.5.4) within a conda environment.
This works fine if I run the command directly, but when I submit it with qsub to the PBS cluster I'm using, I get an error. My qsub script looks like this:
#PBS stuff...
source ~/.bashrc
hostname
conda activate PGC_de_novo
cd $workDir
snakefile="..."
qsub_script="pbs_qsub_snakemake_wrapper.py"
job_script="..."
snakemake -s $snakefile --configfile test.conf.yml --cluster "python $qsub_script" --latency-wait 60 --use-conda -p -j 10 --jobscript "$job_script" >out 2>err
And the error message I get is:
...
Traceback (most recent call last):
File "/path/to/pbs_qsub_snakemake_wrapper.py", line 6, in <module>
from snakemake.utils import read_job_properties
ImportError: No module named snakemake.utils
Error submitting jobscript (exit code 1):
...
So it looks like, for some reason, my cluster script doesn't find snakemake, although snakemake is clearly installed. As I said, this problem keeps coming and going. It stays for a few hours, then goes away for no apparent reason. I guess this indicates an environment problem, but I really can't figure out what it is, and I've run out of ideas. I've tried:
but nothing. Any ideas where to look? Thanks!
Following @Manavalan Gajapathy's advice, I added print(sys.version) commands to both the snakefile and the cluster script, and in both cases got a Python version (2.7.5) different from the one in the activated environment (3.7.5).
To cut a long story short - for some reason when I activate the environment within a PBS job, the environment path is added to the $PATH only after /usr/bin, which results in /usr/bin/python being used (which does not have the snakemake package). When the env is activated locally, the env path is added to the beginning of the $PATH, so the right python is used.
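One way to confirm this from inside the job is to print which $PATH entry supplies the first python after activation. The following is only a minimal sketch; first_python_dir is a hypothetical helper name, not something provided by conda or snakemake.

```shell
# Print the first directory in a PATH-style string that contains an
# executable file named `python`.
# $1: optional PATH-style string (defaults to the current $PATH).
# The body runs in a subshell, so the IFS and globbing changes do not leak.
first_python_dir() (
    set -f          # do not glob-expand PATH entries
    IFS=':'         # split the string on colons
    for dir in ${1:-$PATH}; do
        [ -n "$dir" ] || dir=.   # an empty entry means the current directory
        if [ -x "$dir/python" ] && [ ! -d "$dir/python" ]; then
            printf '%s\n' "$dir"
            exit 0
        fi
    done
    exit 1          # no python found on the given path
)
```

Calling it right after conda activate in the qsub script, for example echo "python comes from: $(first_python_dir)" >&2, should print the environment's bin directory. If it prints /usr/bin instead, the ordering problem described above is in effect.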
I still don't understand this behavior, but at least I could work around it by changing the $PATH. I guess this is not a very elegant solution, but it works for me.
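For anyone hitting the same thing, the $PATH change can be sketched roughly as below. This is an illustration rather than the exact lines from my script: prepend_to_path is a hypothetical helper name, and it assumes the environment's executables live in $CONDA_PREFIX/bin (CONDA_PREFIX is set by conda activate).

```shell
# Move (or add) a directory to the front of a PATH-style string and print
# the result. Any later occurrence of the same directory is dropped.
# $1: directory to put first
# $2: optional PATH-style string (defaults to the current $PATH)
# The body runs in a subshell, so the IFS and globbing changes do not leak.
prepend_to_path() (
    set -f
    target=$1
    result=$target
    IFS=':'
    for dir in ${2:-$PATH}; do
        [ "$dir" = "$target" ] && continue   # skip the duplicate entry
        result="$result:$dir"
    done
    printf '%s\n' "$result"
)

# Intended use in the qsub script, after activating the environment:
#   conda activate PGC_de_novo
#   PATH=$(prepend_to_path "$CONDA_PREFIX/bin")
#   export PATH
```

With the environment's bin directory first, both snakemake and the python called in --cluster "python $qsub_script" should resolve to the environment's interpreter instead of /usr/bin/python.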
One possibility is that some cluster nodes can't find the path to the snakemake package, so when a job is submitted to one of those nodes you get the error.
I don't know if or how that could happen, but if that is the case you could find the offending nodes with something like:
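# Assumes pbsnodes -a prints each node name on an unindented line (attributes indented)
for node in $(pbsnodes -a | awk '/^[^[:space:]]/ {print $1}')
do
echo "$node"
ssh "$node" 'python -c "from snakemake.utils import read_job_properties"'
done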
(The for loop is meant to iterate through the available nodes as reported by pbsnodes. I don't have the exact syntax at hand and it may differ on your system, but hopefully you get the idea.) This would at least narrow down the problem a bit.