I am calling
val appName: String = arguments.getNameFromConfig
val conf = new SparkConf()
conf.set("spark.driver.maxResultSize", "30G")
conf.set("spark.app.name", appName)
println("Master: " + arguments.getMaster)
conf.setMaster(arguments.getMaster)
val sc = new SparkContext(conf)
in order to identify my jobs in the UI more easily. However, the scheduler does not use this name; instead it shows the path to the main class, Word2VecOnCluster. The name I set only appears in the page title.
A colleague of mine is doing the same thing and for him it works. What you cannot see here is that my application name is a bit longer:
W2V_rtype-yelp_w2vpart-1_vsize-100_lr-0.025_dskeep-5.0perc_vocabsize-100000_epochs-1_iter-1
So could it be that there is a limit on the length of the name? If so, it might be worth adding to the documentation. Or is there another reason why this happens?
When you submit the application in cluster mode, the name set inside the SparkConf is not picked up, because by the time the driver code runs the application has already been registered. You can instead pass --name {appName} to the spark-submit command to show that name in the YARN ResourceManager.
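For example, a sketch of such a spark-submit invocation (the jar path and the package prefix of the main class are placeholders for your own build):

```shell
# In cluster mode the app name must be passed on the command line,
# since SparkConf is only read after the application has started.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --name "W2V_rtype-yelp_w2vpart-1_vsize-100" \
  --class com.example.Word2VecOnCluster \
  path/to/your-app.jar
```

With this, the YARN ResourceManager lists the job under the value given to --name rather than the main class.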