I'm running a Spark job with speculation mode enabled. I have around 500 tasks and around 500 gz-compressed files of 1 GB each. In each job, for 1-2 tasks, I keep getting the error below, after which the task is rerun dozens of times (preventing the job from completing).
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0
Any idea what this problem means and how to overcome it?
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:384)
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:381)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:380)
    at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:176)
    at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
    at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Transformations which can cause a shuffle include repartition operations like repartition and coalesce, 'ByKey operations (except for counting) like groupByKey and reduceByKey, and join operations like cogroup and join. A short sketch of these follows.
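For illustration only, here is a minimal sketch (object name, app name, and sample data are placeholders, not from the question) that exercises each of the shuffle-inducing transformation families listed above on a small pair RDD:

import org.apache.spark.{SparkConf, SparkContext}

object ShuffleExamples {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("shuffle-examples"))

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

    // Each of these introduces a stage boundary: map-side outputs are written
    // to shuffle files and fetched by the reducers of the next stage.
    val grouped       = pairs.groupByKey()          // 'ByKey operation
    val reduced       = pairs.reduceByKey(_ + _)    // 'ByKey operation
    val repartitioned = reduced.repartition(10)     // repartition operation
    val joined        = reduced.join(grouped)       // join operation

    joined.collect().foreach(println)
    sc.stop()
  }
}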
Coalesce doesn't involve a full shuffle. If the number of partitions is reduced from 5 to 2, coalesce leaves the data already on 2 of the executors in place and only moves the data from the remaining 3 executors onto them, thereby avoiding a full shuffle.
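A hedged sketch of that difference (the partition counts and names are just illustrative):

import org.apache.spark.{SparkConf, SparkContext}

object CoalesceVsRepartition {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("coalesce-vs-repartition"))

    val rdd = sc.parallelize(1 to 100, numSlices = 5)   // start with 5 partitions

    // coalesce(2) merges existing partitions onto fewer executors,
    // so no full shuffle is performed.
    val narrowed = rdd.coalesce(2)

    // repartition(2) is coalesce(2, shuffle = true): every record may move,
    // which rebalances the data evenly but at the cost of a full shuffle.
    val reshuffled = rdd.repartition(2)

    println(s"coalesce:    ${narrowed.partitions.length} partitions")
    println(s"repartition: ${reshuffled.partitions.length} partitions")
    sc.stop()
  }
}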
Shuffle block: a shuffle block uniquely identifies a block of data that belongs to a single shuffled partition and is produced by the shuffle-write operation (performed by a ShuffleMap task) on a single input partition during a shuffle-write stage of a Spark application.
This happened to me when I gave the worker node more memory than it actually has. Since it didn't have swap, Spark crashed while trying to store objects for shuffling once no memory was left.
The solution was to either add swap or configure the worker/executor to use less memory, in addition to using the MEMORY_AND_DISK storage level for several persists.
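For example, something along these lines (the memory value, input path, and sample transformation are assumptions chosen to illustrate the idea, not values from the original setup):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistWithDiskSpill {
  def main(args: Array[String]): Unit = {
    // Keep executor memory within what the machine actually has;
    // "4g" is an illustrative value, not a recommendation.
    val conf = new SparkConf()
      .setAppName("persist-with-disk-spill")
      .set("spark.executor.memory", "4g")

    val sc = new SparkContext(conf)

    val lines = sc.textFile("hdfs:///path/to/input/*.gz")   // placeholder path

    // MEMORY_AND_DISK spills partitions that do not fit in memory to disk
    // instead of failing or having to recompute them.
    val cached = lines.map(_.length).persist(StorageLevel.MEMORY_AND_DISK)

    println(cached.count())
    sc.stop()
  }
}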