Using Spark 2.0, I'm seeing that it is possible to turn a DataFrame of Rows into a DataFrame of case classes. When I try to do so, I'm greeted with a message stating to import spark.implicits._. The issue I have is that IntelliJ isn't recognizing that as a valid import statement. I'm wondering if the import has moved and the message hasn't been updated, or if I don't have the correct packages in my build settings. Here is my build.sbt:
libraryDependencies ++= Seq(
  "org.mongodb.spark" % "mongo-spark-connector_2.11" % "2.0.0-rc0",
  "org.apache.spark" % "spark-core_2.11" % "2.0.0",
  "org.apache.spark" % "spark-sql_2.11" % "2.0.0"
)
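For reference, the kind of conversion I'm attempting looks roughly like this (Person and people.json are just placeholders):

case class Person(name: String, age: Long)

val df = spark.read.json("people.json") // DataFrame, i.e. Dataset[Row]
val ds = df.as[Person]                  // needs an Encoder, hence spark.implicits._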
The implicits object provides implicit conversions for converting Scala objects (including RDDs) into a Dataset, DataFrame, or Column, and supports such conversions through Encoders. It creates a DatasetHolder with the input Seq[T] converted to a Dataset[T] (using SparkSession.createDataset).
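As a minimal sketch of what those conversions enable (assuming a SparkSession named spark is already in scope):

import spark.implicits._

val ds  = Seq(1, 2, 3).toDS()                          // Seq[T] -> Dataset[T] via DatasetHolder
val df  = Seq(("a", 1), ("b", 2)).toDF("key", "value") // Seq of tuples -> DataFrame with named columns
val col = $"key"                                       // $-interpolation to a Column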
In the Apache Spark source code, implicits is an object defined inside the SparkSession class. It extends SQLImplicits, like this: object implicits extends org.apache.spark.sql.SQLImplicits.
SparkSession was introduced in Spark 2.0. It is the entry point to underlying Spark functionality and is used to programmatically create Spark RDDs, DataFrames, and Datasets.
There is no package called spark.implicits.
spark here refers to the SparkSession. If you are inside the REPL, the session is already defined as spark, so you can just type:
import spark.implicits._
If you have defined your own SparkSession somewhere in your code, then adjust it accordingly:
import org.apache.spark.sql.SparkSession

val mySpark = SparkSession
  .builder()
  .appName("Spark SQL basic example")
  .config("spark.some.config.option", "some-value")
  .getOrCreate()

// For implicit conversions like converting RDDs to DataFrames
import mySpark.implicits._
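With the implicits imported, the Row-to-case-class conversion from the question works. A minimal sketch (Record is a hypothetical case class; note that case classes used with Encoders should be defined at the top level, not inside the method where they are used):

case class Record(id: Long, name: String)

val rowDf   = mySpark.range(3).selectExpr("id", "concat('name_', id) as name") // DataFrame of Rows
val typedDs = rowDf.as[Record] // Dataset[Record] instead of Dataset[Row]
typedDs.show()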