Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does spark-shell --master yarn-client fail (yet pyspark --master yarn seems to work)?

I'm trying to run the spark shell on my Hadoop cluster via Yarn. I use

  • Hadoop 2.4.1
  • Spark 1.0.0

My Hadoop cluster already works. In order to use Spark, I built Spark as described here :

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.1 -DskipTests clean package

The compilation works fine, and I can run spark-shell without troubles. However, running it on yarn :

spark-shell --master yarn-client

gets me the following error :

14/07/07 11:30:32 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1404725422955
         yarnAppState: ACCEPTED

14/07/07 11:30:33 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1404725422955
         yarnAppState: FAILED

org.apache.spark.SparkException: Yarn application already ended,might be killed or not able to launch application master
.
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApp(YarnClientSchedulerBackend.scala:105
)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:82)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:136)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:318)
        at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:957)
        at $iwC$$iwC.<init>(<console>:8)
        at $iwC.<init>(<console>:14)
        at <init>(<console>:16)
        at .<init>(<console>:20)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:121)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:120)
        at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:263)
        at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:120)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:913)
        at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:142)
        at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:104)
        at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:930)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Spark manages to communicate with my cluster, but it doesn't work out. Another interesting thing is that I can access my cluster using pyspark --master yarn. However, I get the following warning

14/07/07 14:10:11 WARN cluster.YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

and an infinite computation time when doing something as simple as

sc.wholeTextFiles('hdfs://vm7x64.fr/').collect()

What may be causing this problem ?

like image 682
merours Avatar asked Jul 07 '14 12:07

merours


People also ask

How do you know if YARN is running on Spark?

If it says yarn - it's running on YARN... if it shows a URL of the form spark://... it's a standalone cluster. This might seem like a silly question, but if I run yarn application -list and my process ID is in the output, then this must mean that it's running in yarn mode, right?

How do you start the Spark Shell in YARN mode?

Launching Spark on YARNEnsure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory which contains the (client side) configuration files for the Hadoop cluster. These configs are used to write to HDFS and connect to the YARN ResourceManager.


1 Answers

Please check does your Hadoop cluster is running correctly. On the master node next YARN process must be running:

$ jps
24970 ResourceManager

On slave nodes/executors:

$ jps
14389 NodeManager

Also make sure that you created a reference (or copied those files) to Hadoop configuration in Spark config directory :

$ ll /spark/conf/ | grep site
lrwxrwxrwx  1 hadoop hadoop   33 Jun  8 18:13 core-site.xml -> /hadoop/etc/hadoop/core-site.xml
lrwxrwxrwx  1 hadoop hadoop   33 Jun  8 18:13 hdfs-site.xml -> /hadoop/etc/hadoop/hdfs-site.xml

You also can check ResourceManager Web UI on port 8088 - http://master:8088/cluster/nodes. There must be a list of available nodes and resources.

You must take a look at your log files using next command (application ID you can find in Web UI):

$ yarn logs -applicationId <yourApplicationId>

Or you can look directly to entire log files on Master/ResourceManager host:

$ ll /hadoop/logs/ | grep resourcemanager
-rw-rw-r--  1 hadoop hadoop  368414 Jun 12 18:12 yarn-hadoop-resourcemanager-master.log
-rw-rw-r--  1 hadoop hadoop    2632 Jun 12 17:52 yarn-hadoop-resourcemanager-master.out

And on Slave/NodeManager hosts:

$ ll /hadoop/logs/ | grep nodemanager
-rw-rw-r--  1 hadoop hadoop  284134 Jun 12 18:12 yarn-hadoop-nodemanager-slave.log
-rw-rw-r--  1 hadoop hadoop     702 Jun  9 14:47 yarn-hadoop-nodemanager-slave.out

Also check if all environment variables are correct:

HADOOP_CONF_LIB_NATIVE_DIR=/hadoop/lib/native
HADOOP_MAPRED_HOME=/hadoop
HADOOP_COMMON_HOME=/hadoop
HADOOP_HDFS_HOME=/hadoop
YARN_HOME=/hadoop
HADOOP_INSTALL=/hadoop
HADOOP_CONF_DIR=/hadoop/etc/hadoop
YARN_CONF_DIR=/hadoop/etc/hadoop
SPARK_HOME=/spark
like image 129
Vit D Avatar answered Sep 28 '22 00:09

Vit D