
Spark job fails with java.io.NotSerializableException: org.apache.spark.SparkContext

I am facing the above exception when trying to apply a method (computeDwt) on an RDD[(Int, ArrayBuffer[(Int, Double)])] input. I even extend Serialization to serialize objects in Spark. Here is the code snippet:

input: series: RDD[(Int, ArrayBuffer[(Int, Double)])]
DWTsample extends Serialization is a class that has the computeDwt function.
sc: SparkContext

val kk: RDD[(Int, List[Double])] = series.map(t => (t._1, new DWTsample().computeDwt(sc, t._2)))

Error:
org.apache.spark.SparkException: Job failed: java.io.NotSerializableException: org.apache.spark.SparkContext
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:556)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:503)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:361)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)

Could anyone suggest what the problem might be and what should be done to overcome this issue?

asked May 12 '14 by yh18190

People also ask

What is SparkContext in Apache Spark?

A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster. Only one SparkContext may be active per JVM. You must stop() the active SparkContext before creating a new one.
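
As a hedged illustration of that lifecycle, here is a minimal driver-side sketch (the app name and master URL are placeholders, not taken from the question):

import org.apache.spark.{SparkConf, SparkContext}

// Created on the driver; only one SparkContext may be active per JVM.
val conf = new SparkConf().setAppName("example").setMaster("local[*]")
val sc = new SparkContext(conf)

// RDDs, accumulators, and broadcast variables are created through it.
val rdd = sc.parallelize(1 to 10)
println(rdd.count())

// stop() the active context before creating a new one.
sc.stop()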


1 Answer

The line

series.map(t => (t._1, new DWTsample().computeDwt(sc, t._2)))

references the SparkContext (sc), but SparkContext isn't serializable. SparkContext is designed to expose operations that run on the driver; it can't be referenced or used by code that runs on workers.

You'll have to restructure your code so that sc isn't referenced in your map function closure, for example as in the sketch below.
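
Here is a minimal sketch of one possible restructuring, assuming computeDwt only needs the per-record data and not the SparkContext; the method body is a placeholder, and dropping the sc parameter (and extending the standard Serializable trait) is my assumption, not code from the question:

import scala.collection.mutable.ArrayBuffer
import org.apache.spark.rdd.RDD

// Hypothetical refactor: computeDwt no longer takes a SparkContext,
// so the map closure captures only serializable objects.
class DWTsample extends Serializable {
  def computeDwt(data: ArrayBuffer[(Int, Double)]): List[Double] =
    data.map(_._2).toList // placeholder for the real DWT computation
}

// series is the asker's RDD[(Int, ArrayBuffer[(Int, Double)])].
val kk: RDD[(Int, List[Double])] =
  series.map { case (key, values) => (key, new DWTsample().computeDwt(values)) }

If computeDwt genuinely needs to run Spark operations, that work has to happen on the driver instead (for example on collected data), since transformations running on workers cannot use the SparkContext.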

answered Oct 13 '22 by Josh Rosen