
Spark job on YARN cluster fails with exitCode=13

I am new to Spark and YARN, and I run into exitCode=13 when I submit a Spark job to a YARN cluster. When the job runs in local mode, everything is fine.

The command I used is:

/usr/hdp/current/spark-client/bin/spark-submit \
  --class com.test.sparkTest \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 40 \
  --executor-cores 4 \
  --driver-memory 17g \
  --executor-memory 22g \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  /home/user/sparkTest.jar

Spark Error Log:

16/04/12 17:59:30 INFO Client:
         client token: N/A
         diagnostics: Application application_1459460037715_23007 failed 2 times due to AM Container for appattempt_1459460037715_23007_000002 exited with  exitCode: 13
For more detailed output, check application tracking page:http://b-r06f2-prod.phx2.cpe.net:8088/cluster/app/application_1459460037715_23007
Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e40_1459460037715_23007_02_000001
Exit code: 13
Stack trace: ExitCodeException exitCode=13:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
        at org.apache.hadoop.util.Shell.run(Shell.java:487)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)

Yarn logs:

16/04/12 23:55:35 INFO mapreduce.TableInputFormatBase: Input split length: 977 M bytes.
16/04/12 23:55:41 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:55:51 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:01 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:11 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:11 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x152f0b4fc0e7488
16/04/12 23:56:11 INFO zookeeper.ZooKeeper: Session: 0x152f0b4fc0e7488 closed
16/04/12 23:56:11 INFO zookeeper.ClientCnxn: EventThread shut down
16/04/12 23:56:11 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 2). 2003 bytes result sent to driver
16/04/12 23:56:11 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 82134 ms on localhost (2/3)
16/04/12 23:56:17 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x4508c270df09803
16/04/12 23:56:17 INFO zookeeper.ZooKeeper: Session: 0x4508c270df09803 closed
...
16/04/12 23:56:21 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
16/04/12 23:56:21 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Timed out waiting for SparkContext.)
16/04/12 23:56:21 INFO spark.SparkContext: Invoking stop() from shutdown hook
user_not_found asked Apr 10 '16 20:04



2 Answers

It seems that you have set the master to local in your code:

SparkConf.setMaster("local[*]")

Leave the master unset in the code, and set it when you run spark-submit:

spark-submit --master yarn-client ...
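
One quick way to check for this is to search the project sources for a hard-coded master. The "on localhost" in the TaskSetManager line of the question's log is consistent with a local master, though that is an inference from the log, not something the question confirms. The sketch below is only an illustration: the function name find_hardcoded_master and the default src directory are hypothetical, and it assumes a grep that supports -r.

```shell
# Hypothetical helper: print every source line that calls setMaster(...).
# $1 is the directory to search (defaults to ./src, an assumed layout).
# Exits non-zero when nothing is found, like grep itself.
find_hardcoded_master() {
  grep -rnE 'setMaster[[:space:]]*\(' "${1:-src}"
}

# Example invocation (path is an assumption):
#   find_hardcoded_master src/main
```

If a match such as setMaster("local[*]") shows up, removing that call and passing the master on the command line (for example the --master yarn --deploy-mode cluster flags from the question) lets spark-submit decide where the job runs.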

user1314742 answered Sep 17 '22 14:09


In case it helps someone: another possible cause of this error is an incorrect --class parameter.
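
A rough way to verify the --class value is to check that the jar really contains a matching .class entry. The helper below is a sketch, not part of Spark: has_class is a hypothetical name, and it expects a jar listing (one entry per line, as printed by jar tf) on standard input. The match is case-sensitive, so com.test.sparkTest and com.test.SparkTest are different classes.

```shell
# Hypothetical helper: succeed if the jar listing on stdin contains the
# .class entry for the fully qualified class name given as $1.
has_class() {
  # com.test.sparkTest -> com/test/sparkTest.class
  entry="$(printf '%s' "$1" | tr '.' '/').class"
  # -x: whole-line match, -F: fixed string, -q: no output, status only
  grep -qxF "$entry"
}

# Example invocation, using the jar and class name from the question:
#   jar tf /home/user/sparkTest.jar | has_class com.test.sparkTest && echo found
```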

Jhon Mario Lotero answered Sep 20 '22 14:09