
Error: Must specify a primary resource (JAR or Python file) - Spark scala

I am trying to run a simple Twitter sentiment analysis job that was running fine until now, but something changed and I now get the error below. My command line has all the required parameters in place, including --class, --master, --jars, etc. The only thing I did differently was run sudo apt-get install 7-jdk, which updated the Java version. I am running Spark 1.3.1, so the Java update should not be a problem... I think. Now, even when I run commands like sbt assembly or sbt run, I get an error saying assembly is not a known command.

Here's my command line:

./bin/spark-submit --class Sentimenter --master local[4] --jars /home/ubuntu/spark/spark-example-master/target/scala-2.10/Sentiment_Analysis-assembly-1.0.jar

And here's the output I get:

Error: Must specify a primary resource (JAR or Python file)
Run with --help for usage help or --verbose for debug output
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

Any suggestions would be great!

serendipity asked Jan 07 '23 23:01


1 Answer

From spark-submit --help:

Usage: spark-submit [options] <app jar | python file> [app arguments]

You need to put a jar or Python file on the command line, after the options. So, in your example, you'd need to run:

./bin/spark-submit --class Sentimenter --master local[4] /home/ubuntu/spark/spark-example-master/target/scala-2.10/Sentiment_Analysis-assembly-1.0.jar

Note that I removed --jars. That option is only for adding additional jars (dependencies). If all you have is one jar, that's the primary resource and goes on the command line without any --jars option.
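
If you do end up needing extra dependency jars later, the ordering would look roughly like the sketch below. The two dependency paths are hypothetical placeholders (nothing in the question names them); the class name and assembly jar path are taken from the question. --jars takes a comma-separated list, and the primary jar still comes last, after all options. The snippet only prints the command so you can check the ordering before running it.

```shell
# Primary resource: the assembly jar from the question.
APP_JAR=/home/ubuntu/spark/spark-example-master/target/scala-2.10/Sentiment_Analysis-assembly-1.0.jar

# Hypothetical extra dependencies (placeholder paths, not from the question).
# --jars expects a comma-separated list with no spaces.
DEP_JARS=/path/to/dep-a.jar,/path/to/dep-b.jar

# Options first (--class, --master, --jars), primary jar last.
SUBMIT_CMD="./bin/spark-submit --class Sentimenter --master local[4] --jars $DEP_JARS $APP_JAR"

# Print the assembled command for inspection instead of executing it.
echo "$SUBMIT_CMD"
```

Once the placeholder paths point at real jars, run the printed command from your Spark directory. If you type it by hand, quoting "local[4]" avoids any chance of the shell treating the brackets as a glob pattern.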

Iulian Dragos answered Jan 10 '23 19:01