I am a Spark/YARN newbie and I run into exitCode=13 when I submit a Spark job to a YARN cluster. When the job runs in local mode, everything is fine.
The command I used is:
/usr/hdp/current/spark-client/bin/spark-submit --class com.test.sparkTest --master yarn --deploy-mode cluster --num-executors 40 --executor-cores 4 --driver-memory 17g --executor-memory 22g --files /usr/hdp/current/spark-client/conf/hive-site.xml /home/user/sparkTest.jar
Spark Error Log:
16/04/12 17:59:30 INFO Client:
client token: N/A
diagnostics: Application application_1459460037715_23007 failed 2 times due to AM Container for appattempt_1459460037715_23007_000002 exited with exitCode: 13
For more detailed output, check application tracking page: http://b-r06f2-prod.phx2.cpe.net:8088/cluster/app/application_1459460037715_23007 Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e40_1459460037715_23007_02_000001
Exit code: 13
Stack trace: ExitCodeException exitCode=13:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)

**Yarn logs**

16/04/12 23:55:35 INFO mapreduce.TableInputFormatBase: Input split length: 977 M bytes.
16/04/12 23:55:41 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:55:51 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:01 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:11 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:11 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x152f0b4fc0e7488
16/04/12 23:56:11 INFO zookeeper.ZooKeeper: Session: 0x152f0b4fc0e7488 closed
16/04/12 23:56:11 INFO zookeeper.ClientCnxn: EventThread shut down
16/04/12 23:56:11 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 2). 2003 bytes result sent to driver
16/04/12 23:56:11 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 82134 ms on localhost (2/3)
16/04/12 23:56:17 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x4508c270df09803
16/04/12 23:56:17 INFO zookeeper.ZooKeeper: Session: 0x4508c270df09803 closed
...
16/04/12 23:56:21 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
16/04/12 23:56:21 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Timed out waiting for SparkContext.)
16/04/12 23:56:21 INFO spark.SparkContext: Invoking stop() from shutdown hook
**Spark on YARN**

Spark relies on two key components: a distributed file storage system and a resource manager to schedule workloads. Typically Spark runs with HDFS for storage and with either YARN (Yet Another Resource Negotiator) or Mesos, two of the most common resource managers.
Spark supports two modes for running on YARN: “yarn-cluster” mode, where the driver runs inside the YARN ApplicationMaster on the cluster, and “yarn-client” mode, where the driver runs in the process that launched spark-submit.
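For illustration, a minimal sketch of how the mode is chosen at submit time (the class and jar names are copied from the question above; on older Spark releases the shorthand masters yarn-cluster and yarn-client are also accepted):

# driver runs inside the YARN ApplicationMaster on the cluster
spark-submit --class com.test.sparkTest --master yarn --deploy-mode cluster /home/user/sparkTest.jar

# driver runs locally in the process that launched spark-submit
spark-submit --class com.test.sparkTest --master yarn --deploy-mode client /home/user/sparkTest.jar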
It seems that you have set the master in your code to local, e.g.:

new SparkConf().setMaster("local[*]")

You should leave the master unset in the code and set it when you issue spark-submit:
spark-submit --master yarn-client ...
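As a minimal sketch (assuming a Scala application whose main class is the com.test.sparkTest from the question; the body is just a placeholder), leave the master out of the SparkConf so that spark-submit decides where the driver runs:

package com.test

import org.apache.spark.{SparkConf, SparkContext}

object sparkTest {
  def main(args: Array[String]): Unit = {
    // No setMaster here: the master comes from spark-submit
    // (--master yarn-client / yarn-cluster, or local[*] only for local testing)
    val conf = new SparkConf().setAppName("sparkTest")
    val sc = new SparkContext(conf)

    // ... job logic ...

    sc.stop()
  }
}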
In case it helps someone else: another possible cause of this error is an incorrect --class parameter, as in the sketch below.
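For example (the misspelled com.test.SparkTest is a made-up illustration; com.test.sparkTest is the class from the question), the fully qualified name passed to --class has to match the main class actually compiled into the jar:

# wrong: this class does not exist in the jar, so the ApplicationMaster fails before a SparkContext is created
spark-submit --class com.test.SparkTest --master yarn --deploy-mode cluster /home/user/sparkTest.jar

# right: matches the main class inside the jar
spark-submit --class com.test.sparkTest --master yarn --deploy-mode cluster /home/user/sparkTest.jar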