"Bad substitution" when submitting spark job to yarn-cluster

Tags: apache-spark, hadoop-yarn

I am doing a smoke test against a yarn cluster using yarn-cluster as the master with the SparkPi example program. Here is the command line:

  $SPARK_HOME/bin/spark-submit --master yarn-cluster \
    --executor-memory 8G --executor-cores 240 \
    --class org.apache.spark.examples.SparkPi \
    examples/target/scala-2.11/spark-examples-1.4.1-hadoop2.7.1.jar

YARN accepts the job but then complains about a "bad substitution". Could it be related to hdp.version?

15/09/01 21:54:05 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:05 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1441144443866
     final status: UNDEFINED
     tracking URL: http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/proxy/application_1441066518301_0013/
     user: stack
15/09/01 21:54:06 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:10 INFO yarn.Client: Application report for application_1441066518301_0013 (state: FAILED)
15/09/01 21:54:10 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1441066518301_0013 failed 2 times due to AM Container for appattempt_1441066518301_0013_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/cluster/app/application_1441066518301_0013Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e03_1441066518301_0013_02_000001
Exit code: 1
Exception message: /mnt/yarn/nm/local/usercache/stack/appcache/
application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/
launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:
/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:
/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution

Stack trace: ExitCodeException exitCode=1: /mnt/yarn/nm/local/usercache/stack/appcache/application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Of note here is:

/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution

The "sh" is linked to bash:

$ ll /bin/sh
lrwxrwxrwx 1 root root 4 Sep  1 05:48 /bin/sh -> bash

asked Sep 01 '15 22:09

WestCoastProjects

2 Answers

The error occurs because the ${hdp.version} placeholder is never substituted, so it reaches launch_container.sh literally and bash rejects it. Set hdp.version in a file named java-opts under $SPARK_HOME/conf.

You also have to set

spark.driver.extraJavaOptions -Dhdp.version=XXX
spark.yarn.am.extraJavaOptions -Dhdp.version=XXX

in spark-defaults.conf under $SPARK_HOME/conf, where XXX is the installed HDP version.
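
To see why the launch script fails, it may help to reproduce the error outside YARN. The following is a minimal sketch, assuming bash is on the PATH. The helper try_expand and the variable name hdp_version are made up for illustration and are not part of Spark or Hadoop. A dot is not a legal character in a shell parameter name, so bash refuses to expand ${hdp.version} at all, which matches the message reported for launch_container.sh.

```shell
# Run a snippet in a child bash and report whether it expands cleanly.
# try_expand is a hypothetical helper used only for this demonstration.
try_expand() {
  if bash -c "$1" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "failed"
  fi
}

# Single quotes keep the current shell from touching the placeholder,
# so the child bash sees it exactly as launch_container.sh would.
try_expand 'echo "/usr/hdp/${hdp.version}/hadoop"'   # prints: failed (bad substitution)

# An underscore is legal in a parameter name, so this one expands
# (to an empty string, since hdp_version is unset in the child shell).
try_expand 'echo "/usr/hdp/${hdp_version}/hadoop"'   # prints: ok
```

The placeholder is presumably meant to be resolved from the hdp.version Java system property before the script is generated, which would explain why passing -Dhdp.version makes the error go away.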

answered Oct 21 '22 20:10

zhang zhan

If you are using Spark with HDP, you have to do the following:

Add these entries to $SPARK_HOME/conf/spark-defaults.conf, replacing 2.2.0.0-2041 with your installed HDP version:

spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041

spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

Create a file called java-opts in $SPARK_HOME/conf and put the same option in it, again with your installed HDP version:

-Dhdp.version=2.2.0.0-2041

To find out which HDP version is installed, run this command on the cluster:

hdp-select status hadoop-client
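
The steps above can be sketched as a small shell helper. This is an illustration, not an official HDP or Spark tool: the function names parse_hdp_version and write_hdp_conf are hypothetical, and the parser assumes hdp-select prints a single line of the form "hadoop-client - 2.2.0.0-2041" with the version as the last field. Check the output on your cluster before relying on it.

```shell
# Extract the version from a line such as "hadoop-client - 2.2.0.0-2041".
# Assumption: the version is the last whitespace-separated field.
parse_hdp_version() {
  printf '%s\n' "$1" | awk '{print $NF}'
}

# Write the settings described above into a Spark conf directory.
#   $1 = conf directory (for example $SPARK_HOME/conf)
#   $2 = HDP version string
write_hdp_conf() {
  # Append to spark-defaults.conf so existing settings are kept.
  {
    printf '%s\n' "spark.driver.extraJavaOptions -Dhdp.version=$2"
    printf '%s\n' "spark.yarn.am.extraJavaOptions -Dhdp.version=$2"
  } >> "$1/spark-defaults.conf"
  # java-opts holds only the JVM option, so overwrite it.
  printf '%s\n' "-Dhdp.version=$2" > "$1/java-opts"
}

# Possible usage on a cluster node (not executed here):
#   version="$(parse_hdp_version "$(hdp-select status hadoop-client)")"
#   write_hdp_conf "$SPARK_HOME/conf" "$version"
```

Because the helper appends to spark-defaults.conf, running it twice would duplicate the two entries, so it is best treated as a one-time setup step.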

answered Oct 21 '22 20:10

Sudarsan
