
Spark history server stops working in EMR when logs get large

I am running a Spark job on a 10 TB dataset using EMR, and I am using the Spark history server to monitor its progress. However, when the logs get really large, the Spark history server and the EMR UI both stop updating. Is my EMR job still running, or has it stopped working too?

Furthermore, when the Spark history server stops working, all my EC2 instances go from over 75% CPU utilization down to 0% (they subsequently increase back to under 75%), and the EMR console shows 0 containers reserved and all memory freed (these also return to normal afterwards).

Has something happened to my EMR job? Is there a way I can keep the Spark history server working when the logs get really large?

Thanks.

asked Nov 25 '25 by Chris


1 Answer

Yes, this can happen when a large amount of log history accumulates. You can configure the Spark history server to clean up old event logs automatically.

To enable automatic cleanup of the history logs, set the following properties in the spark-defaults.conf file and restart the Spark history server:

spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.maxAge  12h
spark.history.fs.cleaner.interval 1h
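
If you are launching new clusters, the same properties can also be supplied up front through EMR's spark-defaults configuration classification instead of editing the file by hand. Below is a minimal sketch (not part of the original answer) using boto3; the cluster name, region, release label, instance types, and IAM roles are placeholder assumptions, so adjust them for your environment. For a cluster that is already running, edit spark-defaults.conf on the master node and restart the history server as described above (on recent EMR releases this is typically a systemctl restart on the master node).

# sketch.py - launch an EMR cluster with history-server cleanup enabled (assumed values)
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # assumed region

response = emr.run_job_flow(
    Name="example-spark-cluster",       # hypothetical cluster name
    ReleaseLabel="emr-6.10.0",          # assumed EMR release
    Applications=[{"Name": "Spark"}],
    Configurations=[
        {
            # spark-defaults classification writes these into spark-defaults.conf
            "Classification": "spark-defaults",
            "Properties": {
                "spark.history.fs.cleaner.enabled": "true",
                "spark.history.fs.cleaner.maxAge": "12h",
                "spark.history.fs.cleaner.interval": "1h",
            },
        }
    ],
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",   # assumed default roles
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])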
answered Nov 27 '25 by A.B


