Understanding Kryo serialization buffer overflow error

I am trying to understand the following error. I am running in client mode.

 org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 61186304. To avoid this, increase spark.kryoserializer.buffer.max value.
        at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:300)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:313)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Basically, I am trying to narrow down the problem. Is my understanding right that this error is occurring on the Spark driver side (I am on AWS EMR, so I believe the driver will be running on the master node)? And should I be looking at spark.driver.memory?


asked Apr 01 '18 04:04

soupybionics

1 Answer

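No, the problem is that Kryo does not have enough room in its buffer, so spark.driver.memory is not the setting to look at. The stack trace also goes through Executor$TaskRunner.run, which means the serialization failed on an executor rather than on the driver. You should raise spark.kryoserializer.buffer.max, either in your properties file or by passing --conf "spark.kryoserializer.buffer.max=128m" to your spark-submit command. 128m should be big enough for you.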
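To see why 128m is a plausible choice, here is a rough back-of-the-envelope sketch in plain Scala (no Spark dependency). It assumes the job was running with the documented default of 64m for spark.kryoserializer.buffer.max, which the question does not confirm, and it reads "Available: 0" as meaning the buffer was already full when the failed write asked for 61186304 more bytes. The object and method names (KryoBufferEstimate, minBufferMaxMiB) are made up for this illustration and are not part of Spark or Kryo.

```scala
object KryoBufferEstimate {
  private val MiB: Long = 1024L * 1024L

  /** Smallest whole-MiB buffer size that holds the bytes already written
    * plus the write that failed. This is only a lower bound: more data may
    * have followed the failed write.
    */
  def minBufferMaxMiB(currentMaxBytes: Long, availableBytes: Long, requiredBytes: Long): Long = {
    val alreadyWritten = currentMaxBytes - availableBytes
    val neededBytes = alreadyWritten + requiredBytes
    (neededBytes + MiB - 1) / MiB // round up to a whole MiB
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the default 64m limit was in effect when the error occurred.
    val assumedCurrentMax = 64L * MiB
    // "Available" and "required" are taken from the error message in the question.
    val lowerBound = minBufferMaxMiB(assumedCurrentMax, availableBytes = 0L, requiredBytes = 61186304L)
    println(s"buffer.max must be at least ${lowerBound}m for this write to fit")
  }
}
```

Under those assumptions the lower bound comes out at 123m, so 128m covers the write that failed, though with little headroom. If the error comes back with a new "required" figure, raise the value further; the Spark configuration docs state it must stay below 2048m.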


answered Oct 12 '22 22:10

Mike Pone

Tags: scala, apache-spark, kryo
