
Spark-submit not working when application jar is in hdfs

I'm trying to run a Spark application using bin/spark-submit. When I reference my application jar on my local filesystem, it works. However, when I copy my application jar to a directory in HDFS, I get the following exception:

Warning: Skip remote jar hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar. java.lang.ClassNotFoundException: com.example.SimpleApp

Here's the command:

$ ./bin/spark-submit --class com.example.SimpleApp --master local hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar

I'm using Hadoop version 2.6.0 and Spark version 1.2.1.
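Before submitting, it can help to confirm that the jar actually exists at that HDFS path and contains the named class. A minimal sketch, reusing the URI and class name from the question above:

```shell
# List the jar in HDFS to rule out a wrong path or a permissions issue
hdfs dfs -ls hdfs://localhost:9000/user/hdfs/jars/

# Copy the jar locally and verify the class is really inside it
hdfs dfs -get hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar /tmp/
unzip -l /tmp/simple-project-1.0-SNAPSHOT.jar | grep SimpleApp
```

If the class listing is missing, the ClassNotFoundException comes from the jar itself rather than from how it is referenced.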

asked Feb 26 '15 by dilm


1 Answer

The only way it worked for me was to submit with:

--master yarn-cluster
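A sketch of the full command under that setting, reusing the class and jar path from the question (this assumes a working YARN setup; the paths are the question's, not verified here):

```shell
# In yarn-cluster mode the driver runs inside the cluster, where the
# HDFS jar can be localized; local mode skips remote jars (hence the
# "Skip remote jar" warning).
./bin/spark-submit \
  --class com.example.SimpleApp \
  --master yarn-cluster \
  hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar
```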

answered Sep 30 '22 by Romain