I have a MapReduce task that I want to run on a Spark YARN cluster from my Java code. I also want to retrieve the reduce result (a string and number pair, i.e. a tuple) in my Java code. Something like:
// I know setMaster("YARN") is wrong, but it's just to describe what I want:
// I want to execute the job on the cluster.
SparkConf sparkConf = new SparkConf().setAppName("Test").setMaster("YARN");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
JavaRDD<Integer> input = sc.parallelize(list);
// map
JavaPairRDD<String, Integer> results = input.mapToPair(new MapToPairExample());
// reduce
String max = results.max(new MyResultsComparator())._1();
It works if I set the master to local, local[*] or spark://master:7707.
So the question is: can I do the same with a YARN cluster somehow?
Typically, when you pass the master as yarn and the deploy mode as cluster, a spark-submit command works like this (source: the GitHub code base for Spark): the client builds a YARN application, ships the application jar, its dependencies and the Spark configuration to the cluster, and submits the application to the ResourceManager. In this flow, those preparation steps happen on the client/gateway machine; from the moment the ApplicationMaster container is launched on a NodeManager, everything (including your driver, in cluster mode) executes on the YARN cluster.
Now, to answer your question: I haven't ever tried executing Spark in yarn-cluster mode from code, but based on the above flow, your piece of code can only run within an ApplicationMaster container on a NodeManager machine of the YARN cluster if you wish it to run in yarn-cluster mode. And your code can reach there only if you launch it with spark-submit --master yarn --deploy-mode cluster from the command line, so specifying the master in the code alone will not get it onto the cluster.
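For illustration, a typical submission could look like this (the class and jar names are placeholders for your own; the flags themselves are standard spark-submit options):

spark-submit \
  --class com.example.Test \
  --master yarn \
  --deploy-mode cluster \
  my-app.jar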
Any corrections to this are welcome!
You need to do it using spark-submit. spark-submit handles many things for you, from shipping dependencies to the cluster to setting the correct classpath. When you run it as a plain Java main program in local mode, your IDE takes care of the classpath (since the driver and executors run in the same JVM).
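For instance, extra dependency jars can be listed with the standard --jars option, and spark-submit distributes them to the cluster for you (the paths here are placeholders):

spark-submit \
  --class com.example.Test \
  --master yarn \
  --jars /path/to/dep1.jar,/path/to/dep2.jar \
  my-app.jar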
You can also use "yarn-client" mode if you want your driver program to run on your machine.
For yarn-cluster mode, use .setMaster("yarn-cluster").
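In yarn-client mode the driver stays in your JVM, so the result of max() comes back as a regular local object, which matches what you asked for. A minimal sketch, assuming Spark 1.x with Java 8, HADOOP_CONF_DIR pointing at your cluster configuration, and sample map/compare logic standing in for your MapToPairExample and MyResultsComparator (on Spark 2.x+ the master would be "yarn" with deploy mode "client" instead):

import java.io.Serializable;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class YarnClientMaxExample {
    public static void main(String[] args) {
        // yarn-client keeps the driver in this JVM, so max() returns a
        // plain local object you can use afterwards.
        SparkConf conf = new SparkConf()
                .setAppName("Test")
                .setMaster("yarn-client");
        JavaSparkContext sc = new JavaSparkContext(conf);

        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
        JavaRDD<Integer> input = sc.parallelize(list);

        // map: sample logic standing in for MapToPairExample
        JavaPairRDD<String, Integer> results = input.mapToPair(
                i -> new Tuple2<>(i % 2 == 0 ? "even" : "odd", i));

        // reduce: sample comparator standing in for MyResultsComparator;
        // the intersection cast makes the lambda serializable so Spark
        // can ship it to the executors
        Tuple2<String, Integer> max = results.max(
                (Comparator<Tuple2<String, Integer>> & Serializable)
                        (a, b) -> Integer.compare(a._2(), b._2()));

        System.out.println(max._1() + " -> " + max._2());
        sc.stop();
    }
}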