
Spark history server stops working in EMR when logs get large

I am running a Spark job on a 10 TB dataset using EMR, and I am using the Spark history server to monitor its progress. However, when the logs get really large, the Spark history server and the EMR UI both stop updating. Is my EMR job still running, or has it stopped working too?

Furthermore, when the Spark history server stops working, all my EC2 instances go from over 75% CPU utilization down to 0% (they subsequently increase back to under 75%), and the EMR console shows 0 containers reserved and all memory freed (these also return to normal afterwards).

Has something happened to my EMR job? Is there a way I can keep the Spark history server working when the logs get really large?

Thanks.

asked Nov 25 '25 by Chris


1 Answer

Yes, this can happen when a large amount of log history accumulates. You can configure the Spark history server to clean up old event logs automatically.

To enable automatic cleanup of the history logs, set the following properties in the spark-defaults.conf file and restart the Spark history server:

spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.maxAge  12h
spark.history.fs.cleaner.interval 1h
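
If you are launching new clusters, the same properties can also be supplied up front through EMR's spark-defaults configuration classification instead of editing the file by hand. Below is a minimal sketch (not part of the original answer) using boto3; the cluster name, region, release label, instance types, and IAM roles are placeholder assumptions, so adjust them for your environment. For a cluster that is already running, edit spark-defaults.conf on the master node and restart the history server as described above (on recent EMR releases this is typically a systemctl restart on the master node).

# sketch.py - launch an EMR cluster with history-server cleanup enabled (assumed values)
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # assumed region

response = emr.run_job_flow(
    Name="example-spark-cluster",       # hypothetical cluster name
    ReleaseLabel="emr-6.10.0",          # assumed EMR release
    Applications=[{"Name": "Spark"}],
    Configurations=[
        {
            # spark-defaults classification writes these into spark-defaults.conf
            "Classification": "spark-defaults",
            "Properties": {
                "spark.history.fs.cleaner.enabled": "true",
                "spark.history.fs.cleaner.maxAge": "12h",
                "spark.history.fs.cleaner.interval": "1h",
            },
        }
    ],
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",   # assumed default roles
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])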
answered Nov 27 '25 by A.B


