snakemake cluster script ImportError snakemake.utils

I have a strange issue that comes and goes randomly and I really can't figure out when and why.
I am running a snakemake pipeline like this:

conda activate $myEnv    
snakemake -s $snakefile --configfile test.conf.yml --cluster "python $qsub_script" --latency-wait 60 --use-conda -p -j 10 --jobscript "$job_script"

I installed snakemake 5.9.1 (also tried downgrading to 5.5.4) within a conda environment.
This works fine if I just run this command, but when I qsub this command to the PBS cluster I'm using, I get an error. My qsub script looks like this:

#PBS stuff...

source ~/.bashrc
hostname
conda activate PGC_de_novo

cd $workDir
snakefile="..."
qsub_script="pbs_qsub_snakemake_wrapper.py"
job_script="..."
snakemake -s $snakefile --configfile test.conf.yml --cluster "python $qsub_script" --latency-wait 60 --use-conda -p -j 10 --jobscript "$job_script" >out 2>err

And the error message I get is:

...
Traceback (most recent call last):
  File "/path/to/pbs_qsub_snakemake_wrapper.py", line 6, in <module>
    from snakemake.utils import read_job_properties
ImportError: No module named snakemake.utils
Error submitting jobscript (exit code 1):
...

So it looks like for some reason my cluster script doesn't find snakemake, although snakemake is clearly installed. As I said, this problem keeps coming and going. It stays for a few hours, then goes away for no apparent reason. I guess this indicates an environment problem, but I really can't figure out what, and I've run out of ideas. I've tried:

  • different conda versions
  • different snakemake versions
  • different nodes on the cluster
  • ssh-ing to the node it just failed on and trying to reproduce the error

but nothing. Any ideas where to look? Thanks!

soungalo asked Dec 26 '19 20:12


2 Answers

Following @Manavalan Gajapathy's advice, I added print(sys.version) commands both to the snakefile and the cluster script, and in both cases got a python version (2.7.5) different than the one indicated in the activated environment (3.7.5).
To cut a long story short - for some reason when I activate the environment within a PBS job, the environment path is added to the $PATH only after /usr/bin, which results in /usr/bin/python being used (which does not have the snakemake package). When the env is activated locally, the env path is added to the beginning of the $PATH, so the right python is used.
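A check along these lines can be dropped into both the Snakefile and the cluster script. This is a minimal sketch, not the exact code used here: the function name `describe_interpreter` is made up for illustration, and it assumes the environment was activated with conda, which sets `CONDA_PREFIX`. It reports which interpreter is running and where the environment's bin directory sits in $PATH relative to /usr/bin. It is written to run under Python 2 as well, since that is the interpreter picked up in the failing case.

```python
from __future__ import print_function  # the failing case runs under Python 2

import os
import sys


def describe_interpreter(environ=None):
    """Return facts about the running Python and the PATH search order."""
    environ = os.environ if environ is None else environ
    entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    prefix = environ.get("CONDA_PREFIX")  # assumption: set by `conda activate`
    env_bin = os.path.join(prefix, "bin") if prefix else None

    def position(directory):
        # Index in PATH, or None when the directory is not listed at all.
        return entries.index(directory) if directory in entries else None

    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "env_bin": env_bin,
        "env_bin_position": position(env_bin),
        "usr_bin_position": position("/usr/bin"),
    }


if __name__ == "__main__":
    # Write to stderr so the report lands in the job's error log.
    for key, value in sorted(describe_interpreter().items()):
        print("{}: {}".format(key, value), file=sys.stderr)
```

If `env_bin_position` is larger than `usr_bin_position`, a bare `python` resolves to /usr/bin/python rather than the environment's interpreter.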
I still don't understand this behavior, but at least I could work around it by changing the $PATH. I guess this is not a very elegant solution, but it works for me.
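The answer does not show how the $PATH was changed. One way to express the idea, sketched here in Python under that caveat, is to move the environment's bin directory to the front of the search path before launching anything that calls `python`. The helper names `env_bin_first` and `fixed_environment` are hypothetical, and the code assumes `CONDA_PREFIX` is set by `conda activate`.

```python
import os


def env_bin_first(path_value, env_bin):
    """Return path_value with env_bin moved (or added) to the front."""
    entries = [p for p in path_value.split(os.pathsep) if p and p != env_bin]
    return os.pathsep.join([env_bin] + entries)


def fixed_environment(environ=None):
    """Copy the environment, putting the active conda env's bin first in PATH."""
    env = dict(os.environ if environ is None else environ)
    prefix = env.get("CONDA_PREFIX")  # assumption: set by `conda activate`
    if prefix:  # leave PATH untouched when no conda env is active
        env["PATH"] = env_bin_first(env.get("PATH", ""), os.path.join(prefix, "bin"))
    return env
```

The returned mapping can be passed as `env=` to `subprocess.call` or `subprocess.run`, so that `python` in the child command should resolve to the environment's interpreter. In the PBS script itself, the same effect would come from prepending `$CONDA_PREFIX/bin` to `$PATH` right after `conda activate`.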

soungalo answered Oct 20 '22 05:10


One possibility is that some cluster nodes can't find the path to the snakemake package, so you get the error whenever a job is submitted to one of those nodes.

I don't know if or how that could happen, but if that is the case you could find the offending nodes with something like:

# Assumes `pbsnodes -a` prints each node name on its own unindented line
for node in $(pbsnodes -a | grep -E '^[^[:space:]]')
do
    echo "$node"
    ssh "$node" 'python -c "from snakemake.utils import read_job_properties"'
done

(The pbsnodes part is meant to list the available nodes. I don't have the exact syntax at hand and it may differ between PBS versions, but hopefully you get the idea.) This would at least narrow down the problem a bit.

dariober answered Oct 20 '22 05:10