How to get a Spark job's metrics?

We have a cluster with about 20 nodes. The cluster is shared among many users and jobs, so it is very difficult for me to observe my job and collect metrics such as CPU usage, I/O, network, and memory.

How can I get metrics at the job level?

PS: The cluster already has Ganglia installed, but I am not sure how to get it to work at the job level. What I would like to do is monitor the resources the cluster uses to execute my job only.

asked Dec 07 '15 by diplomaticguru

1 Answer

You can get the Spark job metrics from the Spark History Server, which displays information about:
- A list of scheduler stages and tasks
- A summary of RDD sizes and memory usage
- Environmental information
- Information about the running executors

1. Set spark.eventLog.enabled to true before starting the Spark application. This configures Spark to log Spark events to persistent storage.
2. Set spark.history.fs.logDirectory to the directory that contains the application event logs to be loaded by the history server (see the example configuration below).
3. Start the history server by executing: ./sbin/start-history-server.sh
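As a minimal sketch of how these settings could fit together, assuming the event logs are kept on HDFS (the path hdfs:///spark-logs is only a placeholder; use any directory both the application and the History Server can access), the properties can go into conf/spark-defaults.conf:

    # conf/spark-defaults.conf
    # Enable event logging so finished jobs can be replayed by the History Server
    spark.eventLog.enabled           true
    # Directory where the running application writes its event logs (placeholder path)
    spark.eventLog.dir               hdfs:///spark-logs
    # Directory the History Server reads event logs from (usually the same location)
    spark.history.fs.logDirectory    hdfs:///spark-logs

    # Then start the History Server on a node that can read that directory:
    ./sbin/start-history-server.sh

Once it is running, the History Server UI is available on port 18080 by default (http://<history-server-host>:18080), where you can open your application's entry and inspect its stages, tasks, and executor memory usage.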

Please refer to the link below for more information:
http://spark.apache.org/docs/latest/monitoring.html
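The monitoring page linked above also documents a REST API exposed under /api/v1 on the History Server (and on a running application's UI), which is handy if you want to pull the metrics programmatically. A small sketch, assuming the History Server runs on its default port 18080 and <app-id> is your application's ID as returned by the first call:

    # List applications known to the History Server
    curl http://<history-server-host>:18080/api/v1/applications

    # Per-executor metrics (memory used, task time, input/shuffle bytes) for one application
    curl http://<history-server-host>:18080/api/v1/applications/<app-id>/executors

    # Per-stage metrics for the same application
    curl http://<history-server-host>:18080/api/v1/applications/<app-id>/stages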

answered Oct 02 '22 by Shawn Guo