 

spark-submit for a .scala file

I have been running some test Spark Scala code using what is probably a bad way of doing things, with spark-shell:

spark-shell --conf spark.neo4j.bolt.password=Stuffffit --packages neo4j-contrib:neo4j-spark-connector:2.0.0-M2,graphframes:graphframes:0.2.0-spark2.0-s_2.11 -i neo4jsparkCluster.scala 

This would execute my code on Spark and drop into the shell when done.

Now that I am trying to run this on a cluster, I think I need to use spark-submit, which I thought would be:

spark-submit --conf spark.neo4j.bolt.password=Stuffffit --packages neo4j-contrib:neo4j-spark-connector:2.0.0-M2,graphframes:graphframes:0.2.0-spark2.0-s_2.11 -i neo4jsparkCluster.scala 

but it does not like the .scala file; does it somehow have to be compiled into a class? The Scala code is a simple file with several helper classes defined in it and no real main class, so to speak. I don't see it in the help files, but maybe I am missing it. Can I just spark-submit a file, or do I have to somehow give it the class, thus changing my Scala code?

I did change my Scala code too. It went from this:

val conf = new SparkConf().setMaster("local").setAppName("neo4jspark")


val sc = new SparkContext(conf)  

To this:

val sc = new SparkContext(new SparkConf().setMaster("spark://192.20.0.71:7077"))
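
For completeness, a minimal sketch of what I think the full setup should look like, keeping the app name (the master URL is just my cluster's):

import org.apache.spark.{SparkConf, SparkContext}

// Keep the app name and point at the standalone cluster master.
val conf = new SparkConf()
  .setMaster("spark://192.20.0.71:7077")
  .setAppName("neo4jspark")
val sc = new SparkContext(conf)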
— asked by Codejoy, Dec 05 '17


2 Answers

There are two quick and dirty ways of doing this:

  1. Without modifying the Scala file

Simply use the spark shell with the -i flag:

$SPARK_HOME/bin/spark-shell -i neo4jsparkCluster.scala
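
Note that with -i the shell still drops into the REPL after running the script. If you want it to exit when the script finishes, one trick (my assumption, not part of the original answer) is to end the script with an explicit exit:

// Last line of neo4jsparkCluster.scala: quit the shell once the script is done.
System.exit(0)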

  2. Modifying the Scala file to include a main method (see the sketch after step b)

a. Compile:

scalac -classpath <location of spark jars on your machine> neo4jsparkCluster.scala

b. Submit it to your cluster:

/usr/lib/spark/bin/spark-submit --class <qualified class name> --master <master URL> <path to application jar>
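
A minimal sketch of what the modified file might look like (the object name here is hypothetical, and the master is left to be supplied via --master):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical entry point; spark-submit needs a class with a main method.
object Neo4jSparkCluster {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("neo4jspark")
    val sc = new SparkContext(conf)
    // ... your existing helper classes and logic go here ...
    sc.stop()
  }
}

You would then pass --class Neo4jSparkCluster to spark-submit, along with the jar produced by compiling and packaging the file.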

— answered by shridharama


You can take a look at the following Hello World example for Spark, which packages your application as @zachdb86 already mentioned:

spark-hello-world
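
If you go the packaging route, a minimal build.sbt sketch might look like the following (the version numbers are assumptions matching the Spark 2.0 / Scala 2.11 artifacts in the question):

name := "neo4jspark"

version := "0.1"

scalaVersion := "2.11.8"

// "provided" because the cluster supplies Spark at runtime.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" % "provided"

Running sbt package then produces a jar under target/scala-2.11/ that you can hand to spark-submit.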

— answered by Zouzias