 

Zeppelin: How to restart the SparkContext in Zeppelin

I am using the isolated mode of Zeppelin's Spark interpreter; in this mode Zeppelin starts a separate job on the Spark cluster for each notebook. I want to kill that job via Zeppelin when the notebook execution is complete. To do this I called sc.stop(), which stopped the SparkContext, and the job was removed from the Spark cluster as well. But the next time I run the notebook, the SparkContext does not start again. How can I restart it?
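
For reference, the stop step was a one-line paragraph along these lines (the comment is illustrative, not part of the original question):

%spark
// stop the SparkContext; the corresponding job disappears from the cluster
sc.stop()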

asked Nov 11 '16 by eatSleepCode

3 Answers

It's a bit counterintuitive, but you need to use the Interpreter menu instead of stopping the SparkContext directly:

  • Go to the interpreter list.

    [screenshot: interpreter list]

  • Find the Spark interpreter and click restart in the upper right corner:

    [screenshot: Spark interpreter]

answered by user6022341


You can restart the interpreter for the notebook in the interpreter bindings (the gear icon in the upper right corner) by clicking the restart icon to the left of the interpreter in question (in this case, the Spark interpreter).

https://i.stack.imgur.com/MAm7a.png

answered by Mark Brown


While working with Zeppelin and Spark I stumbled upon the same problem and did some investigating. My initial conclusions were:

  • Stopping the SparkContext can be accomplished with sc.stop() in a paragraph.
  • Restarting the SparkContext only works via the UI (Menu -> Interpreter -> Spark interpreter -> click the restart button).

However, since the UI allows restarting the Spark interpreter with a button press, why not just reverse engineer the API call behind the restart button? It turns out that restarting the Spark interpreter sends the following HTTP request:

PUT http://localhost:8080/api/interpreter/setting/restart/spark

Fortunately, Zeppelin can work with multiple interpreters, one of which is a shell interpreter. I therefore created two paragraphs:

The first paragraph was for stopping the SparkContext whenever needed:

%spark
// stop SparkContext
sc.stop()

The second paragraph was for restarting the SparkContext programmatically:

%sh
# restart SparkContext
curl -X PUT http://localhost:8080/api/interpreter/setting/restart/spark

After stopping and restarting the SparkContext with the two paragraphs, I ran another paragraph to check whether the restart worked... and it did! So while this is no official solution and more of a workaround, it is still legitimate, since it does nothing more than "press" the restart button from within a paragraph.
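
A minimal check paragraph might look like this (the specific check is an illustrative sketch, not part of the original answer):

%spark
// sanity check: a freshly restarted SparkContext should accept new jobs
println(sc.version)
println(sc.parallelize(1 to 10).sum())  // prints 55.0 if the context is alive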

Zeppelin version: 0.8.1

answered by bajro