How to execute spark-shell from file with nohup?

Tags:

apache-spark

I have a Scala script file that runs successfully via the interactive spark-shell in the classic way: start spark-shell, paste the script, wait for completion.

I want to be able to leave this running, exit the ssh session, and come back to the results when I need them.

I tried this, and it behaves strangely:

spark-shell -i file.scala >> out.log 2>&1 &

It prints only a few lines of the usual Spark output to out.log and then reports that the process has ended. Yet when I run ps aux | grep spark, I see Spark still running among the processes.

When I run the following, it behaves as expected, but I have to leave the session open to keep my results:

spark-shell -i file.scala

Is there a way to get spark-shell working with nohup properly?

I know spark-submit works with jars, but that feels less intuitive; for a simple test I would have to assemble a jar and do Maven magic.
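For comparison, here is a rough sketch of the spark-submit route mentioned above; the class name, master URL, and jar path are placeholders for illustration, not taken from the question:

# hypothetical class, master, and jar; substitute your own values
nohup spark-submit --class com.example.Main --master local[*] \
  target/app-assembly.jar >> out.log 2>&1 &

Unlike spark-shell, spark-submit does not expect an interactive stdin, which is part of why it pairs more naturally with nohup.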

asked by snowindy
1 Answer

I encountered the same behavior of spark-shell with nohup. The reasons behind this are unclear, but one can use tmux instead of nohup as a work-around. A pretty good guide on how to use tmux can be found here.

A possible set of actions is as follows:

$ tmux new -s session-name    # start a new, named tmux session
$ ./bin/spark-shell           # launch spark-shell inside it
# do your usual interactive work
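To disconnect on purpose rather than just closing the window, detach with the standard tmux key sequence: press Ctrl-b, then d. The session, and spark-shell inside it, keeps running.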

Then, if you close the terminal window and exit the ssh session, you can re-attach to the tmux session like this:

$ tmux attach -t session-name
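If you later forget the session name, tmux can list the active sessions (standard tmux usage, nothing Spark-specific):

$ tmux ls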
answered by Nikolay Vasiliev