When performing a shuffle, my Spark job fails with "no space left on device", but when I run df -h it says I have free space left! Why does this happen, and how can I fix it?
A "no space left on device" error often means you are over quota in the directory you're trying to create or move files to. By default, Spark uses the /tmp directory to store intermediate shuffle data. If you actually do have space left on some other device, you can change this by creating the file SPARK_HOME/conf/spark-defaults.conf (where SPARK_HOME is the root directory of your Spark install) and adding the line:

spark.local.dir SOME/DIR/WHERE/YOU/HAVE/SPACE
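If you would rather set this per application instead of cluster-wide, here is a minimal Scala sketch (the path and application name are made up; note that on YARN or standalone clusters the SPARK_LOCAL_DIRS / LOCAL_DIRS environment variables set by the cluster manager take precedence over this setting):

import org.apache.spark.{SparkConf, SparkContext}

// Point Spark's scratch space at a volume with plenty of free space and inodes.
// "/mnt/big-disk/spark-tmp" is a hypothetical example path.
val conf = new SparkConf()
  .setAppName("shuffle-heavy-job")
  .set("spark.local.dir", "/mnt/big-disk/spark-tmp")

val sc = new SparkContext(conf)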
You also need to monitor df -i, which shows how many inodes are in use. On each machine, the shuffle creates M * R temporary files, where M is the number of map tasks and R is the number of reduce tasks (see https://spark-project.atlassian.net/browse/SPARK-751). For example, 5,000 map tasks and 2,000 reduce tasks produce 10,000,000 temporary files per machine, which can exhaust the filesystem's inodes long before its bytes; running out of inodes triggers the same "no space left on device" error even though df -h still shows free space.
If you do indeed see that disks are running out of inodes, you can fix the problem by:

- reducing the number of partitions with coalesce (with shuffle = false), which lowers M and therefore the number of shuffle files (see the sketch below);
- setting spark.shuffle.consolidateFiles to true; see https://spark-project.atlassian.net/secure/attachment/10600/Consolidating%20Shuffle%20Files%20in%20Spark.pdf.

EDIT: Shuffle file consolidation (spark.shuffle.consolidateFiles) has been removed from Spark since version 1.6: https://issues.apache.org/jira/browse/SPARK-9808
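For the coalesce route, here is a minimal Scala sketch (input path, output path, and the target partition count of 200 are hypothetical and should be tuned to your data):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("fewer-shuffle-files"))

// An input with thousands of small partitions means thousands of map tasks (M).
val lines = sc.textFile("hdfs:///data/huge-input")

// coalesce with shuffle = false merges partitions without performing a shuffle,
// so the following shuffle stage writes far fewer of the M * R temporary files.
val fewer = lines.coalesce(200, shuffle = false)

fewer
  .map(line => (line.length, 1L))
  .reduceByKey(_ + _)      // this shuffle now creates far fewer intermediate files
  .saveAsTextFile("hdfs:///data/output")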