
How to launch Spark's ApplicationMaster on a particular node in YARN cluster?

I have a YARN cluster with a master node running the ResourceManager and two other nodes. I am able to submit a Spark application from a client machine in "yarn-cluster" mode. Is there a way to configure which node in the cluster launches the Spark ApplicationMaster?

I ask this because if the ApplicationMaster launches on the master node it works fine, but if it starts on one of the other nodes I get this:

Retrying connect to server: 0.0.0.0/0.0.0.0:8030.

and the job is simply accepted and never runs.

Sporty asked Mar 18 '23 04:03
1 Answer

If you're using a new enough version of YARN (2.6 or newer, according to the Spark docs), you can use YARN node labels.

This Hortonworks guide walks through applying node labels to your YARN NodeManagers.
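As a rough sketch of that setup (the label name `master` and the host/port are placeholders, not values from the guide; node labels must already be enabled in `yarn-site.xml` with a writable `yarn.node-labels.fs-store.root-dir`):

```shell
# Register a label with the cluster -- "master" is an assumed name.
yarn rmadmin -addToClusterNodeLabels "master"

# Attach the label to the NodeManager on the desired host
# (replace master-host:45454 with your NodeManager's address).
yarn rmadmin -replaceLabelsOnNode "master-host:45454=master"

# Verify the labels are registered.
yarn cluster --list-node-labels
```

With the label attached to only the master node, anything scheduled with that label expression can land only there.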

If you use Spark 1.6 or newer, then this JIRA added support for YARN node labels in Spark. You can then pass spark.yarn.am.nodeLabelExpression to restrict ApplicationMaster placement and, if you ever need it, spark.yarn.executor.nodeLabelExpression for executor placement.
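For example, a submission might look like this (the label `master`, the class name, and the jar are placeholders; this assumes a label has already been attached to the master node):

```shell
# Pin the ApplicationMaster to nodes carrying the "master" label.
# In yarn-cluster mode the driver runs inside the AM, so this also
# pins the driver to that node.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.am.nodeLabelExpression=master \
  --class com.example.MyApp \
  my-app.jar
```

Note that on YARN versions older than 2.6 these properties are silently ignored, so the label expression only takes effect on a cluster that supports node labels.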

Dennis Huo answered Mar 21 '23 12:03