I am looking for some general advice rather than a coding solution. When submitting a job via bsub I can retrieve a log of stdout by specifying either of the following:
bsub -o log.txt   # sends stdout to log.txt
bsub -u me@email  # sends the job report (including stdout) to me by email
These are both great, but my program also creates a folder when submitted via bsub, and that folder is stored on the remote server. Essentially I want to:
a) retrieve that folder and its contents
b) do this automatically when the job finishes
I could technically do (a) using scp -r, but I would have to do it manually. That's not too bad if I get an email alert when the job finishes, but I'd still have to do it by hand.
On to (b): I can't see any special flag for bsub to retrieve the actual results, only stdout. I suppose I could have a script which uses sleep, set to the expected job time (perhaps a bit longer, just to be safe), something like:
#!/bin/bash
scp myfile.txt server:main/subfolder           # stage the input file
ssh server 'bsub -u my@email' < myprogram.sh   # feed the local job script to bsub on the server
sleep <job-time>                               # wait (roughly) as long as the job should take
scp -r server:main/subfolder result_folder     # pull the results back
However, I am slightly concerned about being logged out, etc., and the script terminating before the job is finished.
Does anyone have any suggestions?
I essentially want an interface (a website, in future) where a user can submit a file, the file is analysed remotely, the user is emailed when the job starts and finishes, the results are automatically retrieved back to the local machine/webserver, and the user gets an email saying they can pick up their results.
One step at a time, though!
You can tar your results directory to stdout, into your logfile. Then un-tar the logfile to retrieve the directory.
Add the tar czf - ... command to the end of your script.
If you have other stuff appearing on stdout first, move it to stderr instead, or echo some unique string before the tar, grep for it, and tar from there. Here's a sort of test of the principle:
marker='#magic'        # some unique string
log=/tmp/b             # your logfile
echo 'test' > /tmp/a   # just something to tar for this test
# -- in your script, at the end --
# echo "$marker"; tar cf - /tmp/a
# -- equivalent in this test:
(echo 'hello'; echo "$marker"; tar cf - /tmp/a) > "$log"
# -- to recover the tar --
# compute the byte offset of the first byte after the marker line
start=$(grep -ab "$marker" < "$log" | awk -F: '{print 1+$1+length($2)}')
dd skip=1 bs="$start" < "$log" |
tar tvf -   # use tar x really
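Against the real logfile, and with z added to match the tar czf - suggested above, the recovery stage would look something like this (a sketch: log.txt stands in for your bsub -o logfile, and $marker is as defined above):
start=$(grep -ab "$marker" < log.txt | awk -F: '{print 1+$1+length($2)}')
dd skip=1 bs="$start" < log.txt | tar xzf -   # extract the gzipped results directory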
You can submit the job in blocking mode (bsub -K). This makes the bsub command return only when the job is complete or an error is found.
Quote from documentation:
-K
Submits a job and waits for the job to complete. Sends the message "Waiting for dispatch" to the terminal when you submit the job. Sends the message "Job is finished" to the terminal when the job is done. If LSB_SUBK_SHOW_EXEC_HOST is enabled in lsf.conf, also sends the message "Starting on execution_host" when the job starts running on the execution host.
You are not able to submit another job until the job is completed. This is useful when completion of the job is required to proceed, such as a job script. If the job needs to be rerun due to transient failures, bsub returns after the job finishes successfully. bsub exits with the same exit code as the job so that job scripts can take appropriate actions based on the exit codes. bsub exits with value 126 if the job was terminated while pending.
You cannot use the -K option with the -I, -Ip, or -Is options.
Next, you could run scp or a similar program to automatically copy the results from the remote host without checking your email. :)
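Putting the two together, a wrapper could look something like this (a minimal sketch, reusing the placeholder names from the question: server, myprogram.sh, result_folder):
#!/bin/bash
scp myfile.txt server:main/subfolder             # stage the input file
# -K blocks until the job completes; bsub exits with the job's exit code
if ssh server 'bsub -K -u my@email' < myprogram.sh; then
    scp -r server:main/subfolder result_folder   # job succeeded: fetch the results
else
    echo "job failed or was terminated" >&2
fi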
You could also prefix your wrapper script with nohup to prevent it from being killed if the session logs out.
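For example, assuming the wrapper above is saved as submit_and_fetch.sh (a hypothetical name):
nohup ./submit_and_fetch.sh > wrapper.log 2>&1 &   # keeps running after logout; output goes to wrapper.log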