I wrote the following:
val a = 1 to 10000
val b = sc.parallelize(a)
and it shows an error saying:
<console>:12: error: not found: value sc
Any help?
In Spark/PySpark, 'sc' is a SparkContext object that is created up front by default in the spark-shell and pyspark shells (it is also predefined in Databricks notebooks). However, when you write a standalone program you need to create a SparkSession yourself, which internally creates the SparkContext.
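In Scala that looks roughly like the sketch below, assuming Spark 2.x or later; the application name "MyApp" and the local[*] master are placeholder values, not anything from your setup:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("MyApp")        // placeholder name, shown on the cluster UI
  .master("local[*]")      // run locally using all available cores
  .getOrCreate()
val sc = spark.sparkContext  // the SparkContext the session created internally
val b = sc.parallelize(1 to 10000)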
In an older-style program you can instead build the SparkContext directly from a SparkConf (shown here in PySpark, as in the Spark programming guide):

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName(appName).setMaster(master)
sc = SparkContext(conf=conf)

The appName parameter is a name for your application to show on the cluster UI. master is a Spark, Mesos or YARN cluster URL, or a special "local" string to run in local mode.
You can create a SQLContext in the Spark shell by passing the default SparkContext object (sc) as a parameter to the SQLContext constructor.
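For example, in a Spark 1.x shell (in Spark 2.x, SQLContext is deprecated in favour of SparkSession) that is a single line:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)  // wraps the shell's predefined sc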
In the Apache Spark source code, implicits is an object declared inside the SparkSession class, and it extends SQLImplicits, like this: object implicits extends org.apache.spark.sql.SQLImplicits.
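That object is what you import the implicit conversions from. A minimal sketch, assuming you are in spark-shell where a session named spark is predefined:

import spark.implicits._  // encoders and conversions defined on spark.implicits

val df = Seq(1, 2, 3).toDF("value")  // toDF on a local Seq comes from these implicits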
In my case I have Spark installed on a local Windows system and I observed the same error, but it was caused by the issue below.

Issue: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable.

This was a permissions problem. I resolved it by changing the permissions with the command below. Although the log says "on HDFS", the path is on the Windows system:
E:\winutils\bin\winutils.exe chmod 777 E:\tmp\hive
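If you want to confirm the change took effect, winutils can also list the permissions; this assumes the same E:\tmp\hive path as above:

E:\winutils\bin\winutils.exe ls E:\tmp\hive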