 

SparkSession and context confusion

I have a PySpark 2.0.0 script with the following session defined:

spark = SparkSession \
    .builder \
    .appName("Python Spark") \
    .master("local[*]")\
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()

I trained a random forest model and want to save it, so I call the following method:

model_rf.save( spark, "/home/Desktop")

but it throws the following error:

TypeError: sc should be a SparkContext, got type <class 'pyspark.sql.session.SparkSession'>

When I instead define a SparkContext, like so:

from pyspark import SparkContext
sc = SparkContext()
model_rf.save( sc, "/home/Desktop")

I get the error:

Cannot run multiple SparkContexts at once; existing SparkContext(app=Python Spark, master=local[*]) created by getOrCreate at <ipython-input-1-c5f83810f880>:24 
asked Dec 21 '16 by Kratos

1 Answer

Use spark.sparkContext: the SparkSession object already holds a SparkContext, so pass that instead of creating a new one. Creating a second context is what causes the "Cannot run multiple SparkContexts at once" error.

model_rf.save(spark.sparkContext, "/home/Desktop")
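
For context, here is a minimal end-to-end sketch of that fix. It assumes model_rf is an RDD-based MLlib model trained with pyspark.mllib.tree.RandomForest (which is what the "sc should be a SparkContext" error suggests); the toy training data and the rf_model save path are placeholders:

from pyspark.sql import SparkSession
from pyspark.mllib.tree import RandomForest
from pyspark.mllib.regression import LabeledPoint

spark = SparkSession \
    .builder \
    .appName("Python Spark") \
    .master("local[*]") \
    .getOrCreate()

# Reuse the SparkContext that getOrCreate() already created;
# building a second one is what triggers the
# "Cannot run multiple SparkContexts at once" error.
sc = spark.sparkContext

# Hypothetical toy training data, just to make the sketch runnable.
data = sc.parallelize([
    LabeledPoint(0.0, [0.0, 1.0]),
    LabeledPoint(1.0, [1.0, 0.0]),
])
model_rf = RandomForest.trainClassifier(
    data, numClasses=2, categoricalFeaturesInfo={}, numTrees=3)

# MLlib models expect a SparkContext, not a SparkSession.
model_rf.save(sc, "/home/Desktop/rf_model")

Note that if you are using the newer DataFrame-based pyspark.ml API instead, the model's save() takes only a path, with no SparkContext argument.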
answered Oct 10 '22 by mrsrinivas