
Spark history server filter jobs by user id or time

My Spark-on-YARN cluster is shared by many users, so the Spark history server lists a large number of jobs. Paging through the history server to find my own job takes a long time, and I couldn't find any option for filtering jobs by user ID on the Spark wiki.

Is there any way to list only the jobs submitted by a particular user, or only those submitted during a particular time window? Thanks.

Rahul Sharma asked Nov 27 '25 23:11


1 Answer

If you are running on YARN, you can rely on YARN itself to list and filter your applications:

yarn application -list | grep -i spark | grep hdpuser 

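This should list the Spark applications submitted by hdpuser. Note that by default yarn application -list shows only applications in the SUBMITTED, ACCEPTED, and RUNNING states; add -appStates ALL to include finished ones. You can also see all your jobs in the YARN web UI and filter them there by various criteria, just as with the yarn commands.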
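
If grepping the CLI output is too coarse, the same filtering can be pushed to the ResourceManager's REST endpoint, which accepts user, applicationTypes, startedTimeBegin and startedTimeEnd query parameters (times in epoch milliseconds). The following is only a minimal sketch: the helper names and the ResourceManager host are hypothetical, and 8088 is assumed to be the web port (the default).

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def build_yarn_apps_url(rm_base, user, started_after, started_before):
    """Build a 'cluster apps' query filtered by user and start-time window.

    rm_base is the ResourceManager web address (hypothetical in this sketch).
    """
    params = {
        "applicationTypes": "SPARK",
        "user": user,
        "startedTimeBegin": to_epoch_ms(started_after),
        "startedTimeEnd": to_epoch_ms(started_before),
    }
    return f"{rm_base.rstrip('/')}/ws/v1/cluster/apps?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host; adjust to your cluster's ResourceManager address.
    print(
        build_yarn_apps_url(
            "http://resourcemanager.example.com:8088",
            "hdpuser",
            datetime(2025, 11, 27, tzinfo=timezone.utc),
            datetime(2025, 11, 28, tzinfo=timezone.utc),
        )
    )
```

The printed URL can be fetched with curl or a browser; the response is a JSON list of the matching applications.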

Using the Spark REST API (served under the /api/v1 prefix), the path /applications/[app-id]/environment returns the environment details for your Spark application (this endpoint is only available in Spark 2.2 and later). Look at the user.name property; its value should be the name of the user who started the Spark job.
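
The history server's REST API can also narrow the list directly: /api/v1/applications accepts minDate and maxDate query parameters, and each application attempt in the response carries a sparkUser field. Here is a minimal sketch of combining the two. The function names and the history server address are hypothetical, and 18080 is assumed to be the history server port (the default).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def applications_url(history_base, min_date=None, max_date=None):
    """Build the application-list URL, optionally limited to a date window.

    Dates are strings such as "2025-11-27".
    """
    params = {}
    if min_date:
        params["minDate"] = min_date
    if max_date:
        params["maxDate"] = max_date
    url = f"{history_base.rstrip('/')}/api/v1/applications"
    return f"{url}?{urlencode(params)}" if params else url


def filter_by_user(applications, user):
    """Keep applications that have at least one attempt started by `user`."""
    return [
        app
        for app in applications
        if any(a.get("sparkUser") == user for a in app.get("attempts", []))
    ]


def fetch_applications(url):
    """Fetch and decode the JSON application list from the history server."""
    with urlopen(url) as response:
        return json.load(response)


# Example usage (hypothetical host, requires a reachable history server):
#   url = applications_url("http://historyserver.example.com:18080",
#                          "2025-11-27", "2025-11-28")
#   for app in filter_by_user(fetch_applications(url), "hdpuser"):
#       print(app["id"], app["name"])
```

The date window is applied by the server, while the user filter runs client-side on the returned list.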

Take a look at the environment properties listed in the Spark web UI on port 4040 to see all the available properties.

dumitru answered Nov 30 '25 08:11


