I'm trying to run a spark-ml example, but this code
from pyspark import SparkContext
import pyspark.sql
sc = SparkContext(appName="PythonStreamingQueueStream")
training = sqlContext.createDataFrame([
(1.0, Vectors.dense([0.0, 1.1, 0.1])),
(0.0, Vectors.dense([2.0, 1.0, -1.0])),
(0.0, Vectors.dense([2.0, 1.3, 1.0])),
(1.0, Vectors.dense([0.0, 1.2, -0.5]))], ["label", "features"])
does not run; the terminal reports:
NameError: name 'SQLContext' is not defined
Why did this happen? How can I solve it?
If you are using the Apache Spark 1.x line (i.e. prior to Apache Spark 2.0), sqlContext is not defined for you in a standalone script. You need to import SQLContext and create a sqlContext from your SparkContext yourself, i.e.
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
If you're using Apache Spark 2.0 or later, you can use the SparkSession directly instead (named spark in the shell; in a script you create it with SparkSession.builder.getOrCreate()). Your code then becomes
training = spark.createDataFrame(...)
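Putting the pieces together, here is a minimal sketch for Spark 2.x that rebuilds the DataFrame from the question. The names ROWS and make_training_df are hypothetical helpers introduced only for this illustration; they are not part of PySpark. It assumes PySpark 2.0 or later is installed, where Vectors lives in pyspark.ml.linalg.

```python
# Rows from the question as plain Python data: (label, feature values).
ROWS = [
    (1.0, [0.0, 1.1, 0.1]),
    (0.0, [2.0, 1.0, -1.0]),
    (0.0, [2.0, 1.3, 1.0]),
    (1.0, [0.0, 1.2, -0.5]),
]


def make_training_df(spark):
    """Build the training DataFrame from an existing SparkSession.

    Hypothetical helper for illustration; assumes Spark 2.x or later.
    """
    # Imported inside the function so this module can be loaded
    # even on a machine where PySpark is not installed.
    from pyspark.ml.linalg import Vectors

    # Convert each feature list into a dense vector, as in the question.
    data = [(label, Vectors.dense(values)) for label, values in ROWS]
    return spark.createDataFrame(data, ["label", "features"])
```

In a script you would typically call it as make_training_df(SparkSession.builder.appName("example").getOrCreate()), after from pyspark.sql import SparkSession. On Spark 1.x the same approach should work by passing the SQLContext(sc) object instead, since it also has createDataFrame, and importing Vectors from pyspark.mllib.linalg.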
For more information, please refer to the Spark SQL Programming Guide.