Spark without Hadoop: Failed to Launch

I'm running Spark 2.1.0, Hive 2.1.1 and Hadoop 2.7.3 on Ubuntu 16.04.

I downloaded the Spark project from GitHub and built the "without hadoop" version:

./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"

When I run ./sbin/start-master.sh, I get the following exception:

 Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /home/server/spark/conf/:/home/server/spark/jars/*:/home/server/hadoop/etc/hadoop/:/home/server/hadoop/share/hadoop/common/lib/:/home/server/hadoop/share/hadoop/common/:/home/server/hadoop/share/hadoop/mapreduce/:/home/server/hadoop/share/hadoop/mapreduce/lib/:/home/server/hadoop/share/hadoop/yarn/:/home/server/hadoop/share/hadoop/yarn/lib/ -Xmx1g org.apache.spark.deploy.master.Master --host ThinkPad-W550s-Lab --port 7077 --webui-port 8080
 ========================================
 Error: A JNI error has occurred, please check your installation and try again
 Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
     at java.lang.Class.getDeclaredMethods0(Native Method)
     at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
     at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
     at java.lang.Class.getMethod0(Class.java:3018)
     at java.lang.Class.getMethod(Class.java:1784)
     at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
     at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
 Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger
     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
     ... 7 more

I edited SPARK_DIST_CLASSPATH according to the post "Where are hadoop jar files in hadoop 2?":

export SPARK_DIST_CLASSPATH=~/hadoop/share/hadoop/common/lib:~/hadoop/share/hadoop/common:~/hadoop/share/hadoop/mapreduce:~/hadoop/share/hadoop/mapreduce/lib:~/hadoop/share/hadoop/yarn:~/hadoop/share/hadoop/yarn/lib

But I'm still getting the same error. I can see the slf4j jar file is under ~/hadoop/share/hadoop/common/lib.
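(Editor's note: one likely explanation is that the JVM treats a bare directory on the classpath as a source of .class files only; jars inside a directory are only picked up when the entry ends in /*. Under that assumption, the directory-based export would need wildcards, along these lines — paths copied from the export above:)

```shell
# Sketch only: same directories as the export above, but with /* appended so
# the JVM picks up the jar files (including the slf4j jar) inside each one.
# Bare directory entries on a Java classpath match .class files, not jars.
export SPARK_DIST_CLASSPATH=~/hadoop/share/hadoop/common/lib/*:~/hadoop/share/hadoop/common/*:~/hadoop/share/hadoop/mapreduce/*:~/hadoop/share/hadoop/mapreduce/lib/*:~/hadoop/share/hadoop/yarn/*:~/hadoop/share/hadoop/yarn/lib/*
```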

How could I fix this error?

Thank you!

asked Feb 17 '17 by Top.Deck

People also ask

Can Spark work without Hadoop?

You can run Spark without Hadoop in standalone mode. Spark and Hadoop work well together, but Hadoop is not essential to run Spark: the Spark documentation states that Hadoop is not needed when Spark runs in standalone mode. In that case you only need a resource manager, such as YARN or Mesos.

Does Spark run Hadoop?

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat.

Does Spark need storage?

By default, Spark does not have a storage mechanism. To store data, it needs a fast, scalable file system; you can use S3, HDFS, or any other file system.


1 Answer

“Hadoop free” builds need to modify SPARK_DIST_CLASSPATH to include Hadoop’s package jars.

The most convenient place to do this is by adding an entry in conf/spark-env.sh:

export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)  

See https://spark.apache.org/docs/latest/hadoop-provided.html for details.
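Putting it together, a conf/spark-env.sh for a "Hadoop free" build might look like the sketch below. The Hadoop install path is an assumption taken from the question's log; substitute your own layout:

```shell
# conf/spark-env.sh -- sketch for a Spark build made with -Phadoop-provided.
# /home/server/hadoop is assumed from the question; adjust to your install.
export HADOOP_HOME=/home/server/hadoop

# `hadoop classpath` prints every directory and jar that Hadoop itself uses
# (slf4j included), with the /* wildcards already in place, so Spark's
# launcher finds all Hadoop-provided classes.
export SPARK_DIST_CLASSPATH=$("${HADOOP_HOME}/bin/hadoop" classpath)
```

After saving this, rerun ./sbin/start-master.sh; the generated Spark command's -cp argument should then contain wildcarded entries rather than bare directories.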

answered Oct 02 '22 by yuxh