We have a cluster of about 20 nodes. The cluster is shared among many users and jobs, so it is very difficult for me to observe my job and collect metrics such as CPU usage, I/O, network, memory, etc.
How can I get metrics at the job level?
PS: The cluster already has Ganglia installed, but I am not sure how to make it work at the job level. What I would like to do is monitor the resources used by the cluster to execute my job only.
You can get Spark job metrics from the Spark History Server, which displays information about:
- A list of scheduler stages and tasks
- A summary of RDD sizes and memory usage
- Environmental information
- Information about the running executors
1. Set spark.eventLog.enabled to true before starting the Spark application. This configures Spark to log Spark events to persistent storage.
2. Set spark.history.fs.logDirectory to the directory that contains the application event logs to be loaded by the history server.
3. Start the history server by executing: ./sbin/start-history-server.sh (see the example setup below).
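For example, a minimal setup might look like the following sketch, assuming the event logs are written to an HDFS directory hdfs:///spark-logs (the path is only an illustration; use whatever shared location your cluster provides):

```
# conf/spark-defaults.conf -- example paths, adjust to your cluster
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark-logs
spark.history.fs.logDirectory    hdfs:///spark-logs
```

Then create the log directory and start the history server:

```
hdfs dfs -mkdir -p /spark-logs
./sbin/start-history-server.sh
```

The history server UI listens on port 18080 by default, and each completed application (i.e. each of your jobs) gets its own entry with per-stage and per-executor metrics. The event-log settings can also be passed per job via spark-submit --conf if you cannot change the cluster-wide defaults.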
Please refer to the link below for more information:
http://spark.apache.org/docs/latest/monitoring.html