I was going through a presentation on Spark memory management and want to know how to get a good graphical picture of executor memory usage (something similar to what was shown in the presentation), in order to understand out-of-memory errors better. Also, what is the best way to analyze off-heap memory usage in Spark executors? How can I find the amount of off-heap memory used as a function of time?
I looked into Ganglia, but it gives node-level metrics, and I found it hard to understand executor-level memory usage from node-level metrics.
Spark UI - Checking the Spark UI is not practical in our case. RM UI - The YARN RM UI seems to display only the total memory consumption of the Spark application (executors plus driver), i.e. total memory per application rather than per-executor usage.
Number of available executors = (total cores / cores per executor) = 150 / 5 = 30. Leaving one executor for the ApplicationMaster gives --num-executors = 29. Number of executors per node = 30 / 10 = 3. Memory per executor = 64 GB / 3 = 21 GB.
Memory overhead is the amount of off-heap memory allocated to each executor. By default, memory overhead is set to either 10% of executor memory or 384 MB, whichever is higher.
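For illustration, here is a minimal sketch of how those numbers could be translated into a Spark configuration. The cluster shape (10 nodes, 64 GB and 15 usable cores each), the rounded 19 GB heap / 2 GB overhead split, and the application name are assumptions carried over from the arithmetic above, not values from the original question.

```scala
import org.apache.spark.sql.SparkSession

// Assumed cluster: 10 nodes, 15 usable cores and 64 GB usable memory per node.
//   executors        = 150 cores / 5 cores-per-executor = 30, minus 1 for the ApplicationMaster = 29
//   memory/executor  = 64 GB / 3 executors-per-node = 21 GB
//   of which roughly 10% (max(10%, 384 MB)) is off-heap overhead -> ~19 GB heap + ~2 GB overhead
val spark = SparkSession.builder()
  .appName("executor-sizing-example")             // hypothetical app name
  .config("spark.executor.instances", "29")
  .config("spark.executor.cores", "5")
  .config("spark.executor.memory", "19g")         // JVM heap per executor
  .config("spark.executor.memoryOverhead", "2g")  // off-heap allocation per executor
  .getOrCreate()
```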
Click Analytics > Spark Analytics > Open the Spark Application Monitoring Page, or click Monitor > Workloads and then the Spark tab. This page displays the names of the clusters that you are authorized to monitor and the number of applications currently running in each cluster.
I've been thinking about a similar tool!
I think org.apache.spark.scheduler.SparkListener is the interface to the low-level metrics in Apache Spark, with onExecutorMetricsUpdate being the method to look at when developing a higher-level monitoring tool.
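As a minimal sketch, a listener along these lines could log per-executor heap and off-heap usage over time. The executorUpdates field, getMetricValue, and the "JVMHeapMemory" / "JVMOffHeapMemory" metric names assume the Spark 3.x ExecutorMetrics API; adjust for the version you run.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorMetricsUpdate}

// Logs heap and off-heap bytes for each executor whenever the driver
// receives an executor metrics update (via executor heartbeats).
class ExecutorMemoryListener extends SparkListener {
  override def onExecutorMetricsUpdate(
      update: SparkListenerExecutorMetricsUpdate): Unit = {
    update.executorUpdates.foreach { case ((stageId, _), metrics) =>
      val heap    = metrics.getMetricValue("JVMHeapMemory")
      val offHeap = metrics.getMetricValue("JVMOffHeapMemory")
      println(s"t=${System.currentTimeMillis()} executor=${update.execId} " +
              s"stage=$stageId heapBytes=$heap offHeapBytes=$offHeap")
    }
  }
}
```

You can register it on the driver with spark.sparkContext.addSparkListener(new ExecutorMemoryListener), or ship it as a class on the classpath and set spark.extraListeners to its fully qualified name.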
You could also monitor the JVM via its JMX interface, but that might be too low-level, and it definitely lacks the contextual information about how Spark uses the resources.
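If you go the JMX route, one way (a sketch, assuming you can reach the executor hosts and accept unauthenticated access inside the cluster) is to pass the standard JVM JMX flags to the executors:

```scala
import org.apache.spark.SparkConf

// Expose each executor JVM over JMX so jconsole/VisualVM or a JMX scraper can
// watch heap and off-heap usage. Port 0 lets each JVM pick a free port, which
// avoids clashes when several executors share a node (you then need to
// discover the chosen port on each host).
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
    "-Dcom.sun.management.jmxremote " +
    "-Dcom.sun.management.jmxremote.port=0 " +
    "-Dcom.sun.management.jmxremote.authenticate=false " +
    "-Dcom.sun.management.jmxremote.ssl=false")
```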