I wrote a sample Spark/Scala program that builds a list of JSON elements from a DataFrame. When I run it from a main method it returns an empty list, but when I run it as an object that extends App it returns a list containing the records. What is the difference between extends App and a main method in a Scala object?
import org.apache.spark.sql.SparkSession
import java.util

object DfToMap {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .appName("Rnd")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._
    val df = Seq(
      (8, "bat"),
      (64, "mouse"),
      (27, "horse")
    ).toDF("number", "word")
    val json = df.toJSON
    val jsonArray = new util.ArrayList[String]()
    json.foreach(f => jsonArray.add(f))
    print(jsonArray)
  }
}
It returns an empty list. But the following program gives me a list with the records:
import org.apache.spark.sql.SparkSession
import java.util

object DfToMap extends App {
  val spark: SparkSession = SparkSession.builder()
    .appName("Rnd")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._
  val df = Seq(
    (8, "bat"),
    (64, "mouse"),
    (27, "horse")
  ).toDF("number", "word")
  val json = df.toJSON
  val jsonArray = new util.ArrayList[String]()
  json.foreach(f => jsonArray.add(f))
  print(jsonArray)
}
Scala provides a helper trait, called App, that supplies the main method. Instead of writing your own main method, an object can extend App to produce a concise, executable application. Here is an example:

object Main extends App {
  Console.println("Hello Scala: " + args.mkString(", "))
}
The main() method is Scala's code-execution entry point, similar to main in C, C++, and Java. As the name implies, this method is the primary entry point for executing any Scala code.
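For contrast, the same program written with an explicit entry point looks like this (a minimal sketch; the object name Main is arbitrary):

object Main {
  // The runtime invokes exactly this signature as the entry point.
  def main(args: Array[String]): Unit =
    println("Hello Scala: " + args.mkString(", "))
}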
App is a trait that turns an object into a runnable program. It is implemented with the DelayedInit mechanism: the entire body of an object inheriting App is deferred and executed as part of an inherited main method.
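This deferral has a visible side effect worth knowing: fields of an App object are not assigned until its main method runs. A minimal sketch of the gotcha, under Scala 2 semantics (the object names here are hypothetical):

object Config extends App {
  // Deferred by DelayedInit: assigned only when Config.main() runs.
  val name = "prod"
}

object Demo {
  def main(args: Array[String]): Unit = {
    println(Config.name)      // prints "null": Config's body has not run yet
    Config.main(Array.empty)  // executes the deferred body
    println(Config.name)      // prints "prod"
  }
}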
Executing Scala code as a script: another way to execute Scala code is to type it into a text file and save it with a name ending in ".scala". We can then execute that code by typing "scala filename".
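For instance, a file holding nothing but top-level statements already runs as a script (hello.scala is a hypothetical file name):

// hello.scala (run with: scala hello.scala)
println("Hello from a Scala script")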
TL;DR Neither snippet is a correct Spark program, but one is just more incorrect than the other.
You've made two mistakes, both explained in the introductory Spark materials.
Due to its nature, Spark doesn't support applications extending App - see Quick Start - Self-Contained Applications:
Note that applications should define a main() method instead of extending scala.App. Subclasses of scala.App may not work correctly.
Spark doesn't provide global shared memory, so mutating a global object inside a closure is not supported - see the Spark Programming Guide - Understanding Closures. The foreach closure runs on the executors against their own deserialized copies of the captured variables, so anything that appears to work here is an accident of local mode running in a single JVM.
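Putting both fixes together, a minimal sketch of a corrected program defines a main() method and uses collect() to bring the JSON strings back to the driver, instead of mutating a driver-side list from inside foreach:

import org.apache.spark.sql.SparkSession

object DfToMap {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("Rnd")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((8, "bat"), (64, "mouse"), (27, "horse")).toDF("number", "word")

    // collect() materializes the distributed Dataset on the driver,
    // so no executor-side mutation of driver state is involved.
    val jsonArray: Array[String] = df.toJSON.collect()
    println(jsonArray.mkString("[", ", ", "]"))

    spark.stop()
  }
}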