
ERROR SparkContext: Error initializing SparkContext

I am using spark-1.5.0-cdh5.6.0. I tried to run a sample application (Scala) with the following command:

> spark-submit --class com.cloudera.spark.simbox.sparksimbox.WordCount --master local /home/hadoop/work/testspark.jar

Got the following error:

 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/user/spark/applicationHistory does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:424)
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:100)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
        at com.cloudera.spark.simbox.sparksimbox.WordCount$.main(WordCount.scala:12)
        at com.cloudera.spark.simbox.sparksimbox.WordCount.main(WordCount.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
asked Mar 16 '16 at 14:03 by G.Saleh



1 Answer

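Spark has a feature called the "history server", which lets you browse an application's events after its SparkContext has stopped. It relies on event logging, which is turned on by setting spark.eventLog.enabled to true. Judging by the stack trace (EventLoggingListener.start, followed by a FileNotFoundException for file:/user/spark/applicationHistory), event logging is enabled in your configuration, but the configured log directory does not exist on the local filesystem.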

You have two options: either specify a valid, existing directory for the event log via the spark.eventLog.dir config value, or simply set spark.eventLog.enabled to false if you don't need it.
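
As a rough sketch, the two options could look like this on the command line, reusing the spark-submit command from the question and passing the settings with --conf. The /tmp/spark-events path is only an illustrative choice (an assumption, not something from the question); any existing, writable directory should do.

```shell
# Option 1: disable event logging for this run.
spark-submit \
  --class com.cloudera.spark.simbox.sparksimbox.WordCount \
  --master local \
  --conf spark.eventLog.enabled=false \
  /home/hadoop/work/testspark.jar

# Option 2: keep event logging, but point it at a directory that exists.
# /tmp/spark-events is an assumed example path; create it before submitting.
mkdir -p /tmp/spark-events
spark-submit \
  --class com.cloudera.spark.simbox.sparksimbox.WordCount \
  --master local \
  --conf spark.eventLog.dir=file:///tmp/spark-events \
  /home/hadoop/work/testspark.jar
```

The same properties can also be set in spark-defaults.conf if you want them to apply to every submission rather than a single run.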

You can read more about these settings on the Spark Configuration page.

answered Oct 19 '22 at 08:10 by Yuval Itzchakov