 

What is imported with spark.implicits._?

Tags:

apache-spark

What is imported with import spark.implicits._? Does "implicits" refer to some package? If so, why could I not find it in the Scala API documentation on https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package?

Joe asked Jan 02 '23 06:01


2 Answers

Scala allows you to import the members of a value "dynamically" into scope. You can do something like this:

final case class Greeting(hi: String)

def greet(greeting: Greeting): Unit = {
  import greeting._ // everything in greeting is now available in scope
  println(hi)
}

The SparkSession instance carries along some implicits that you bring into scope with that import statement. The most important things you get are the Encoders necessary for many operations on DataFrames and Datasets. The import also brings into scope the implicit StringContext conversion necessary for the $"column_name" notation.
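A minimal sketch of what this looks like in practice (assuming Spark is on the classpath; the app name and sample data are illustrative). Without the import, both the toDS call and the $"..." syntax would fail to compile:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("implicits-demo") // illustrative name
  .getOrCreate()

// Brings the Encoders and the $ string interpolator into scope
import spark.implicits._

case class Person(name: String, age: Int)

// toDS needs an implicit Encoder[Person], supplied by the import
val people = Seq(Person("Ann", 30), Person("Bob", 25)).toDS()

// $"age" compiles because the import adds an extension to StringContext
val adults = people.filter($"age" > 26)
adults.show()
```

Note that the import is on the spark *instance*, not on a package, which is why it must appear after the SparkSession has been created.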

The implicits member is an instance of SQLImplicits, whose source code (for version 2.3.1) you can view here.

stefanobaghino answered Feb 15 '23 16:02


Importing members through an object is a Scala language feature, which is why the API documentation does not describe it as a package. In the Apache Spark source code, implicits is an object defined inside the SparkSession class. It extends SQLImplicits like this: object implicits extends org.apache.spark.sql.SQLImplicits with scala.Serializable. SQLImplicits provides functionality such as:

  1. Converting Scala objects to a Dataset (via toDS).
  2. Converting Scala objects to a DataFrame (via toDF).
  3. Converting $"name" into a Column.

By importing implicits through import spark.implicits._, where spark is a SparkSession instance, these conversions become available implicitly.
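The three conversions above can be sketched as follows (a hedged example, assuming Spark is available; the column names and sample data are made up for illustration):

```scala
import org.apache.spark.sql.{Column, SparkSession}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sqlimplicits-demo") // illustrative name
  .getOrCreate()
import spark.implicits._

// 1. toDS: a Scala collection becomes a Dataset
val ds = Seq(1, 2, 3).toDS()

// 2. toDF: a Seq of tuples becomes a DataFrame with named columns
val df = Seq(("a", 1), ("b", 2)).toDF("letter", "number")

// 3. $"name": the implicit StringToColumn conversion turns the
//    interpolated string into a Column
val numberCol: Column = $"number"
df.select(numberCol + 1).show()
```

Each conversion is an implicit defined on SQLImplicits, so none of the three lines would compile without the import.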

Md Shihab Uddin answered Feb 15 '23 15:02