Let's suppose we have a Spark driver program written like this:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
    public static void main(String[] args) {
        String logFile = "YOUR_SPARK_HOME/README.md"; // Should be some file on your system
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("a"); }
        }).count();

        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("b"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
    }
}
and I want to run it on a YARN cluster. Can I avoid using spark-submit (supposing, of course, that I have access to one cluster node) and just specify in the context that I want to run on YARN? In other words, is it possible to launch the Spark client as a regular Java app leveraging YARN?
Use spark://HOST:PORT for a standalone cluster, replacing HOST and PORT with the host and port of the standalone cluster's master. Use local to run locally with one worker thread. Use local[K] to run locally with K worker threads, setting K to the number of cores you have locally.
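For example, assuming the SparkConf from the question above, the master URL can be set directly on the configuration before the context is created (the thread count, host, and port below are placeholders):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MasterUrlExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        // Run locally with 4 worker threads (local[*] uses all available cores).
        conf.setMaster("local[4]");
        // Or target a standalone cluster master instead (host and port are placeholders):
        // conf.setMaster("spark://master-host:7077");
        System.out.println("Master: " + conf.get("spark.master"));
        JavaSparkContext sc = new JavaSparkContext(conf);
        sc.stop();
    }
}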
spark-submit is a shell command used to deploy a Spark application on a cluster.
What happens when a Spark job is submitted? When a client submits Spark application code, the driver implicitly converts the code containing transformations and actions into a logical directed acyclic graph (DAG).
Deploy mode specifies where the driver runs in the deployment environment. It can be one of the following: client (default), where the driver runs on the machine from which the Spark application was launched, or cluster, where the driver runs on a node inside the cluster (on YARN, inside the application master).
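For reference, here is a minimal sketch of what such a submission can look like when driven from a plain Java program: it simply shells out to spark-submit. The jar path and main class are placeholders, and spark-submit is assumed to be on the PATH of a machine with access to the YARN cluster:

import java.util.Arrays;

public class SubmitViaShell {
    public static void main(String[] args) throws Exception {
        // Placeholder jar path and main class; yarn-cluster runs the driver inside the cluster.
        ProcessBuilder pb = new ProcessBuilder(Arrays.asList(
                "spark-submit",
                "--class", "SimpleApp",
                "--master", "yarn-cluster",
                "/path/to/simple-app.jar"));
        pb.inheritIO(); // forward spark-submit's output to this console
        int exitCode = pb.start().waitFor();
        System.out.println("spark-submit exited with " + exitCode);
    }
}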
Here is another official way to do it.
Spark Launcher - Library for launching Spark applications.
This library allows applications to launch Spark programmatically. There's only one entry point to the library - the SparkLauncher class.
To launch a Spark application, just instantiate a SparkLauncher and configure the application to run. For example:
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
    public static void main(String[] args) throws Exception {
        Process spark = new SparkLauncher()
                .setAppResource("/my/app.jar")
                .setMainClass("my.spark.app.Main")
                .setMaster("local")
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .launch();
        spark.waitFor();
    }
}
You can set all the YARN-specific configuration with the setConf method and set the master to yarn-client or yarn-cluster.
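A hedged sketch of such a YARN submission, building on the example above (the jar path, main class, and queue name are placeholders, and HADOOP_CONF_DIR or YARN_CONF_DIR is assumed to point at your cluster configuration):

import org.apache.spark.launcher.SparkLauncher;

public class MyYarnLauncher {
    public static void main(String[] args) throws Exception {
        Process spark = new SparkLauncher()
                .setAppResource("/my/app.jar")              // placeholder application jar
                .setMainClass("my.spark.app.Main")          // placeholder driver main class
                .setMaster("yarn-cluster")                  // driver runs inside the YARN cluster
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")
                .setConf("spark.yarn.queue", "default")     // any YARN-specific setting goes through setConf
                .launch();
        spark.waitFor();
    }
}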
References: https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/launcher/package-summary.html