
Run a Python script in qsub

I have a Python script, main_script.py, that looks like this:

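import os

# Assumption: `path` was not defined in the original snippet;
# it should point at the directory holding the input files.
path = "."
Files = os.listdir(path)
FilesNumber = len(Files)

for fileID in range(0, FilesNumber):
    filename = Files[fileID]

    # load file specified in filename and do stuff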

Basically, it performs the same kind of operations on each file in the variable Files.

I would like to use qsub to parallelize the for loop.

Assume that I have a text file, files.txt, containing all the file names:

//mypath//pathfile1
//mypath//pathfile2
...
//mypath//pathfile100

How can I write the shell script that calls qsub and runs main_script.py? I think that I would also need to adapt main_script.py, but I do not know how.

The scheduler is Torque/Maui.

gabboshow asked Feb 05 '23 22:02


1 Answer

One way to call any executable from a job script is to simply wrap it inside a bash script:

#!/bin/bash

<full path to call executable>

If you name that script script.sh, and script.sh is executable, then you can execute:

qsub script.sh

and it will be submitted to the batch system. The gotchas, which you may well already know, are things like this: if your executable isn't accessible from the compute node, it won't be found when the job executes. The same is true for any files your script uses, so make sure they are all located somewhere the compute nodes can reach, usually a network-accessible filesystem.
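As a rough illustration of that gotcha, a job could check up front that its inputs are visible from the node it landed on, instead of failing halfway through. This is only a sketch: the helper names missing_paths and require_paths are made up for this example, and the paths you pass in are whatever your own script needs.

```python
import os
import socket
import sys


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]


def require_paths(paths):
    """Stop the job with a readable message if any input is not visible here."""
    missing = missing_paths(paths)
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("not visible on %s: %s" % (socket.gethostname(), ", ".join(missing)))
```

Calling require_paths(...) at the top of the job makes a missing mount show up in the job's error file with the name of the compute node.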

If you want to submit the Python script directly, you can add:

#!/usr/bin/python 

to the top (double-check that python is in /usr/bin on your system) and then you can directly qsub your python script. In your case,

qsub main_script.py

When submitted this way, the script no longer has to be in a network-accessible location, but the input files still do.
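To spread the for loop over separate jobs, one option not covered above is a Torque job array: submitting with qsub -t starts one sub-job per index, and each sub-job sees its own index in the PBS_ARRAYID environment variable. The sketch below is one possible adaptation of main_script.py, not a definitive one. It assumes files.txt sits in the directory you submit from (Torque exposes that directory as PBS_O_WORKDIR), and the names pick_file and process are hypothetical placeholders.

```python
#!/usr/bin/python
import os


def pick_file(list_path, task_id):
    """Return the path on line `task_id` (1-based) of the list file."""
    with open(list_path) as handle:
        # Skip blank lines so a trailing newline does not count as an entry.
        paths = [line.strip() for line in handle if line.strip()]
    if not 1 <= task_id <= len(paths):
        raise IndexError("task id %d is outside 1..%d" % (task_id, len(paths)))
    return paths[task_id - 1]


def process(filename):
    # Placeholder for "load file specified in filename and do stuff".
    print("processing %s" % filename)


# Only run when started as a sub-job of an array (PBS_ARRAYID is set by Torque).
if __name__ == "__main__" and "PBS_ARRAYID" in os.environ:
    # Jobs start in the home directory, so locate files.txt via the submit directory.
    workdir = os.environ.get("PBS_O_WORKDIR", ".")
    list_path = os.path.join(workdir, "files.txt")
    process(pick_file(list_path, int(os.environ["PBS_ARRAYID"])))
```

With 100 lines in files.txt, the submission would then look something like:

qsub -t 1-100 main_script.py

Each sub-job handles exactly one line of files.txt, so the loop disappears from the script and the scheduler does the fan-out.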

dbeer answered Feb 19 '23 15:02