I'm trying to run the spark shell on my Hadoop cluster via Yarn. I use
My Hadoop cluster already works. In order to use Spark, I built Spark as described here :
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.1 -DskipTests clean package
The compilation works fine, and I can run spark-shell
without troubles. However, running it on yarn :
spark-shell --master yarn-client
gets me the following error :
14/07/07 11:30:32 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1404725422955
yarnAppState: ACCEPTED
14/07/07 11:30:33 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1404725422955
yarnAppState: FAILED
org.apache.spark.SparkException: Yarn application already ended,might be killed or not able to launch application master
.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApp(YarnClientSchedulerBackend.scala:105
)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:82)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:136)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:318)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:957)
at $iwC$$iwC.<init>(<console>:8)
at $iwC.<init>(<console>:14)
at <init>(<console>:16)
at .<init>(<console>:20)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:121)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:120)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:263)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:120)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:56)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:913)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:142)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:56)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:104)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:56)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:930)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Spark manages to communicate with my cluster, but it doesn't work out.
Another interesting thing is that I can access my cluster using pyspark --master yarn
. However, I get the following warning
14/07/07 14:10:11 WARN cluster.YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
and an infinite computation time when doing something as simple as
sc.wholeTextFiles('hdfs://vm7x64.fr/').collect()
What may be causing this problem ?
If it says yarn - it's running on YARN... if it shows a URL of the form spark://... it's a standalone cluster. This might seem like a silly question, but if I run yarn application -list and my process ID is in the output, then this must mean that it's running in yarn mode, right?
Launching Spark on YARNEnsure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory which contains the (client side) configuration files for the Hadoop cluster. These configs are used to write to HDFS and connect to the YARN ResourceManager.
Please check does your Hadoop cluster is running correctly. On the master node next YARN process must be running:
$ jps
24970 ResourceManager
On slave nodes/executors:
$ jps
14389 NodeManager
Also make sure that you created a reference (or copied those files) to Hadoop configuration in Spark config directory :
$ ll /spark/conf/ | grep site
lrwxrwxrwx 1 hadoop hadoop 33 Jun 8 18:13 core-site.xml -> /hadoop/etc/hadoop/core-site.xml
lrwxrwxrwx 1 hadoop hadoop 33 Jun 8 18:13 hdfs-site.xml -> /hadoop/etc/hadoop/hdfs-site.xml
You also can check ResourceManager Web UI on port 8088 - http://master:8088/cluster/nodes. There must be a list of available nodes and resources.
You must take a look at your log files using next command (application ID you can find in Web UI):
$ yarn logs -applicationId <yourApplicationId>
Or you can look directly to entire log files on Master/ResourceManager host:
$ ll /hadoop/logs/ | grep resourcemanager
-rw-rw-r-- 1 hadoop hadoop 368414 Jun 12 18:12 yarn-hadoop-resourcemanager-master.log
-rw-rw-r-- 1 hadoop hadoop 2632 Jun 12 17:52 yarn-hadoop-resourcemanager-master.out
And on Slave/NodeManager hosts:
$ ll /hadoop/logs/ | grep nodemanager
-rw-rw-r-- 1 hadoop hadoop 284134 Jun 12 18:12 yarn-hadoop-nodemanager-slave.log
-rw-rw-r-- 1 hadoop hadoop 702 Jun 9 14:47 yarn-hadoop-nodemanager-slave.out
Also check if all environment variables are correct:
HADOOP_CONF_LIB_NATIVE_DIR=/hadoop/lib/native
HADOOP_MAPRED_HOME=/hadoop
HADOOP_COMMON_HOME=/hadoop
HADOOP_HDFS_HOME=/hadoop
YARN_HOME=/hadoop
HADOOP_INSTALL=/hadoop
HADOOP_CONF_DIR=/hadoop/etc/hadoop
YARN_CONF_DIR=/hadoop/etc/hadoop
SPARK_HOME=/spark
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With