I'm working in a Jupyter Notebook with a PySpark kernel on one node of a cluster, and my /tmp folder keeps filling up. I have already set these parameters:
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.appDataTtl=172800"
The folder has only 200 GB. Is there a way to tell Spark to clean up when I shut down the kernel in Jupyter? Or should I just set spark.worker.cleanup.appDataTtl to 30 minutes, so that all the temp files and logs are deleted every 30 minutes?
You could try pointing the spark.local.dir parameter at a different location with more free space.
See: https://spark.apache.org/docs/latest/configuration.html
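As a rough illustration of that suggestion, here is a minimal sketch that creates a per-session scratch directory on a larger disk, registers its removal for when the Python process exits normally (which usually includes a clean kernel shutdown), and passes it to Spark as spark.local.dir. The helper names make_scratch_dir and build_session and the path /data/spark-tmp are hypothetical. It also assumes the session is created by your own code: if the kernel has already started a SparkSession, the setting has to go into the kernel's Spark configuration instead, and a cluster manager may override it through SPARK_LOCAL_DIRS (standalone) or LOCAL_DIRS (YARN).

```python
import atexit
import shutil
import tempfile
from pathlib import Path


def make_scratch_dir(base):
    """Create a unique scratch directory under `base` and schedule its removal.

    `base` should live on a volume with more free space than /tmp.
    """
    Path(base).mkdir(parents=True, exist_ok=True)
    scratch = tempfile.mkdtemp(prefix="spark-scratch-", dir=base)
    # Runs on normal interpreter exit; it will not run if the process is killed.
    atexit.register(shutil.rmtree, scratch, ignore_errors=True)
    return scratch


def build_session(scratch_dir, app_name="notebook"):
    """Start a SparkSession whose local scratch space is `scratch_dir`."""
    # Imported here so the helper above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .config("spark.local.dir", scratch_dir)  # must be set before the session starts
        .getOrCreate()
    )


# Example usage (hypothetical path):
# spark = build_session(make_scratch_dir("/data/spark-tmp"))
```

The atexit hook only covers the driver-side directory created here; files written by workers on other nodes would still depend on the spark.worker.cleanup.* settings from the question.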