 

How can I run spark-submit in jupyter notebook?

I tried to run a spark-submit job from a Jupyter notebook to pull data from a networked database:

!spark-submit --packages org.mongodb.spark:mongo-spark-connector_2.10:2.0.0 script.py

and got this message:

jupyter: '/home/user/script.py' is not a Jupyter command

Is there a way to run spark-submit from the notebook?

KR

Mario L asked Sep 17 '25 21:09
1 Answer

If it is an ipykernel (PySpark) kernel, I do not see a need to do a spark-submit: you are already in interactive Spark mode, where a sparkContext and sqlContext are created and available for as long as your kernel is up. It seems you are trying to build a cascade of sorts, i.e. a Spark application inside a Spark application, and so on. You cannot do that with Spark.
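
For illustration, a minimal sketch of reading the MongoDB collection straight from the notebook's existing session, assuming the mongo-spark-connector jar is already on the driver's classpath (for example via spark.jars.packages) and assuming placeholder host, database and collection names:

    # Sketch only: `spark` is the session the PySpark kernel already provides;
    # the URI, database (mydb) and collection (mycollection) are placeholders.
    df = (spark.read
          .format("com.mongodb.spark.sql.DefaultSource")
          .option("uri", "mongodb://db-host:27017/mydb.mycollection")
          .load())

    df.printSchema()
    df.show(5)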

You can instead start a normal Python kernel and then run spark-submit as a shell command using subprocess.Popen or a similar library.
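
A minimal sketch of that approach, assuming script.py sits in the notebook's working directory and reusing the --packages coordinate from the question:

    import subprocess

    # Launch spark-submit as an external process from a plain Python kernel.
    proc = subprocess.Popen(
        ["spark-submit",
         "--packages", "org.mongodb.spark:mongo-spark-connector_2.10:2.0.0",
         "script.py"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )

    out, _ = proc.communicate()   # wait for the job and capture driver output
    print(out.decode())

Check proc.returncode afterwards to detect whether the job failed.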

joshi.n answered Sep 19 '25 09:09