Apache Spark: Garbage Collection Logs for Driver

My Spark driver runs out of memory after running for about 10 hours, with the error Exception in thread "dispatcher-event-loop-17" java.lang.OutOfMemoryError: GC overhead limit exceeded. To debug further, I enabled G1GC and GC logging by setting spark.driver.extraJavaOptions to:

    -Dlog4j.configuration=log4j.properties -XX:+PrintFlagsFinal -XX:+PrintReferenceGC
    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
    -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:+UseG1GC
    -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp

but it does not appear to take effect on the driver.

The job got stuck on the driver again after 10 hours, and I don't see any GC logs under stdout on the driver node at /var/log/hadoop-yarn/userlogs/[application-id]/[container-id]/stdout, so I'm not sure where else to look. According to the Spark GC tuning docs, these settings seem to apply only to worker nodes (which matches what I see here: the workers do have GC logs in stdout after I used the same flags under spark.executor.extraJavaOptions). Is there any way to enable/acquire GC logs from the driver? Under Spark UI -> Environment, I can see these options listed under spark.driver.extraJavaOptions, which is why I assumed they would work.

Environment: The cluster is running on Google Dataproc and I use /usr/bin/spark-submit --master yarn --deploy-mode cluster ... from the master to submit jobs.

EDIT: Setting the same options for the driver on the spark-submit command line works, and I can see the GC logs on stdout for the driver. Setting the options programmatically via SparkConf, however, does not seem to take effect for some reason.
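
For reference, the programmatic approach that does not take effect looks roughly like this (simplified; the app name and flag list here are illustrative, not my exact code):

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    // Simplified illustration: in cluster mode the driver JVM has already
    // been launched by the time this code runs, so this setting cannot
    // change the driver's own JVM flags.
    val conf = new SparkConf()
      .setAppName("my-job") // illustrative
      .set("spark.driver.extraJavaOptions",
        "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
    val spark = SparkSession.builder().config(conf).getOrCreate()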

noobNeverything asked Dec 09 '25


1 Answer

I believe spark.driver.extraJavaOptions is handled by SparkSubmit.scala and needs to be passed at invocation: by the time your application code runs, the driver JVM has already started, so JVM flags set programmatically via SparkConf cannot apply to it. To pass the flags at invocation with Dataproc, you can add them to the properties field (--properties in gcloud dataproc jobs submit spark).

Also, instead of -Dlog4j.configuration=log4j.properties, you can use this guide to configure detailed logging.
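
For reference, a minimal log4j.properties along the usual Spark lines would look something like this (a sketch; adjust categories and the pattern to taste):

    # Minimal sketch of a Spark-style log4j.properties
    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n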

I could see driver GC logs with:

    gcloud dataproc jobs submit spark --cluster CLUSTER_NAME \
      --class org.apache.spark.examples.SparkPi \
      --jars file:///usr/lib/spark/examples/jars/spark-examples.jar \
      --driver-log-levels ROOT=DEBUG \
      --properties=spark.driver.extraJavaOptions="-XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp" --

You probably don't need --driver-log-levels ROOT=DEBUG; you can copy your logging configuration from log4j.properties into that flag instead. If you really want to ship log4j.properties itself, you can probably use --files log4j.properties, along the lines of the sketch below.
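
Something like this (untested sketch; CLUSTER_NAME, your.main.Class, and your-app.jar are placeholders):

    # Untested sketch: ship a custom log4j.properties alongside the job.
    # CLUSTER_NAME, your.main.Class and your-app.jar are placeholders.
    gcloud dataproc jobs submit spark \
      --cluster CLUSTER_NAME \
      --class your.main.Class \
      --jars your-app.jar \
      --files log4j.properties \
      --properties=spark.driver.extraJavaOptions="-Dlog4j.configuration=log4j.properties -verbose:gc -XX:+PrintGCDetails"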

Patrick Clay answered Dec 11 '25


