
Can spark-submit take named arguments?

I know I can pass arguments to the main function like this:

spark-submit com.xxx.test 1 2

and read them like this:

def main(args: Array[String]): Unit = {
    // read the arguments
    val city = args(0)
    val num = args(1)
}

But I want to know whether there is a way to pass named arguments, like:

spark-submit com.xxx.test --citys=1 --num=2

And how do I read these named arguments in main.scala?

jianfeng asked Oct 20 '17 08:10



1 Answer

You can write your own utility object that parses the input arguments by key, something like this:

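object CommandLineUtil {

  // Parses arguments of the form --key=value (or -key=value) into a map.
  // Prints the usage string and exits if there are no arguments or an option is malformed.
  def getOpts(args: Array[String], usage: String): collection.mutable.Map[String, String] = {
    if (args.length == 0) {
      System.err.println(usage)
      System.exit(1)
    }

    // Keep only option-style arguments; positional values are ignored here.
    val opts = args.filter(_.startsWith("-"))

    val optsMap = collection.mutable.Map[String, String]()
    opts.foreach { x =>
      // Split on the first "=" only, so a value may itself contain "=".
      val pair = x.split("=", 2)
      if (pair.length == 2) {
        // Strip the leading "-" or "--" from the key.
        optsMap += (pair(0).replaceFirst("^-{1,2}", "") -> pair(1))
      } else {
        System.err.println(usage)
        System.exit(1)
      }
    }

    optsMap
  }
}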

Then you can use this method within your Spark application:

val usage = "Usage:  [--citys] [--num]"
val optsMap = CommandLineUtil.getOpts(args, usage)
val citysValue = optsMap("citys")
val numValue = optsMap("num")

You can adapt CommandLineUtil to your requirements.
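As one possible adaptation, here is a minimal sketch that returns an immutable Map, throws an exception instead of calling System.exit, and applies a default for an optional key. The names NamedArgs and parse, and the default of 1 for num, are assumptions made for illustration and are not part of the code above.

```scala
object NamedArgs {

  // Hypothetical helper: turns "--key=value" or "-key=value" tokens into an immutable Map.
  // Tokens that do not start with "-" are ignored; options without "=" raise an error.
  def parse(args: Array[String]): Map[String, String] =
    args.filter(_.startsWith("-")).map { arg =>
      arg.split("=", 2) match {
        case Array(key, value) => key.replaceFirst("^-{1,2}", "") -> value
        case _ => throw new IllegalArgumentException(s"Expected --key=value, got: $arg")
      }
    }.toMap

  def main(args: Array[String]): Unit = {
    val opts = parse(args)
    // "citys" is treated as required; "num" falls back to an assumed default of 1.
    val citys = opts.getOrElse("citys", throw new IllegalArgumentException("Missing --citys"))
    val num = opts.get("num").map(_.toInt).getOrElse(1)
    println(s"citys=$citys, num=$num")
  }
}
```

spark-submit hands everything that follows the application JAR to main unchanged, so an invocation along the lines of spark-submit --class com.xxx.test app.jar --citys=1 --num=2 (app.jar being a placeholder) would deliver both options in args.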

Prasad Khode answered Oct 12 '22 10:10