
Missing application resource while running script in pyspark

I have been trying to run a .py script with PySpark, but I keep getting this error:

11:55 $ ./bin/spark-submit --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar --py-files example.py
Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
    at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:160)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:276)
    at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:151)
    at org.apache.spark.launcher.Main.main(Main.java:86)

I can easily execute it by doing this:

 11:57 $  pyspark --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar

and then pasting the code block by block into the IPython interactive shell. However, I want to put the script in a cron job so that it runs automatically. I need a command to put in the cron job, and spark-submit is not working. Any ideas?

Souad asked May 11 '17 10:05


1 Answer

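You need to pass the Python file again at the end of the command. spark-submit takes the main script as a positional argument (the application resource); --py-files only adds extra files to the Python path, so without the trailing example.py there is nothing to run.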

./bin/spark-submit --jars spark-cassandra-connector-2.0.0-M2-s_2.11.jar --py-files example.py example.py
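
Since the goal is a cron job, the command can be wrapped in a small script. The following is only a sketch under stated assumptions: the default paths for SPARK_HOME and JOB_DIR, the function name run_job, and the file name run_example.sh are all hypothetical, and --py-files is left out on the assumption that example.py imports no other local modules. Absolute paths are used because cron runs with a minimal environment and an unrelated working directory.

```shell
#!/bin/sh
# Hypothetical cron wrapper, e.g. saved as run_example.sh.
# SPARK_HOME and JOB_DIR are assumed paths; adjust them to your setup.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"
JOB_DIR="${JOB_DIR:-/home/user/job}"

run_job() {
  # "$@" is an optional command prefix such as `echo` (dry run);
  # with no arguments, spark-submit itself is executed.
  # The script path comes last: it is the application resource.
  "$@" "$SPARK_HOME/bin/spark-submit" \
    --jars "$JOB_DIR/spark-cassandra-connector-2.0.0-M2-s_2.11.jar" \
    "$JOB_DIR/example.py"
}

# Dry run: print the command instead of executing it.
run_job echo
```

Once the printed command looks right, change the last line to a plain `run_job` and schedule the script with a crontab entry such as `0 * * * * /home/user/job/run_example.sh >> /home/user/job/example.log 2>&1` (hourly here, as an example; the log path is also an assumption).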
X.Zhao answered Sep 22 '22 03:09