I tried the code below and cannot import sqlContext.implicits._; the build fails in the Scala IDE with this error:
value implicits is not a member of org.apache.spark.sql.SQLContext
Do I need to add any dependencies in pom.xml?
Spark version: 1.5.2
package com.Spark.ConnectToHadoop

import org.apache.spark.SparkConf
import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.rdd.RDD
//import groovy.sql.Sql.CreateStatementCommand
//import org.apache.spark.SparkConf

object CountWords {
  def main(args: Array[String]) {
    val objConf = new SparkConf().setAppName("Spark Connection").setMaster("spark://IP:7077")
    val sc = new SparkContext(objConf)
    val objHiveContext = new HiveContext(sc)
    objHiveContext.sql("USE test")
    val rdd = objHiveContext.sql("select * from Table1")
    val options = Map("path" -> "hdfs://URL/apps/hive/warehouse/test.db/TableName")
    //val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // Error: value implicits is not a member of org.apache.spark.sql.SQLContext
    val dataframe = rdd.toDF()
    dataframe.write.format("orc").options(options).mode(SaveMode.Overwrite).saveAsTable("TableName")
  }
}
My pom.xml file is as follows:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.Sudhir.Maven1</groupId>
  <artifactId>SparkDemo</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>SparkDemo</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.5.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>1.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.10</artifactId>
      <version>1.5.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>0.9.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.10</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
The implicits object provides implicit conversions for turning Scala objects (including RDDs) into a Dataset, DataFrame, or Column, and supplies the Encoders that back those conversions. For example, it creates a DatasetHolder from an input Seq[T] and converts it to a Dataset[T] using SparkSession.createDataset.
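A minimal sketch of those conversions, assuming a Spark 2.x SparkSession (the app name, master, and data below are illustrative):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("ImplicitsDemo").master("local[*]").getOrCreate()
import spark.implicits._ // brings toDS/toDF and the needed Encoders into scope

val ds = Seq(1, 2, 3).toDS()                          // Seq[T] -> Dataset[T] (via DatasetHolder)
val df = Seq(("a", 1), ("b", 2)).toDF("key", "count") // Seq -> DataFrame with named columns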
You have to create a sqlContext in order to execute SQL statements through it. In Spark 2.0 you can obtain a SQLContext from a SparkSession, as sketched below; alternatively, you can execute SQL statements directly on the SparkSession.
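A short sketch of both options, assuming Spark 2.x (app name and query are placeholders):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("SqlDemo").master("local[*]").getOrCreate()

// Option 1: get a SQLContext from the SparkSession (kept for backward compatibility)
val sqlContext = spark.sqlContext
sqlContext.sql("SELECT 1").show()

// Option 2: run SQL directly on the SparkSession
spark.sql("SELECT 1").show()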
SQLContext is a class used for initializing the functionality of Spark SQL. A SparkContext object (sc) is required to construct a SQLContext; when you launch the shell with $ spark-shell, the SparkContext is initialized for you.
// sc is an existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// This is used to implicitly convert an RDD to a DataFrame.
import sqlContext.implicits._
// Define the schema using a case class.
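Continuing that snippet, a minimal sketch of the case-class step (the Person schema and rows are invented for illustration):

// Hypothetical schema; in older Spark versions define the case class at top level.
case class Person(name: String, age: Int)

// RDD[Person] -> DataFrame via the implicits imported above.
val people = sc.parallelize(Seq(Person("Alice", 29), Person("Bob", 31)))
val peopleDF = people.toDF()
peopleDF.registerTempTable("people") // Spark 1.x API
sqlContext.sql("SELECT name FROM people WHERE age > 30").show()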
With the release of Spark 2.0.0 (July 26, 2016) one should now use the following:
import spark.implicits._ // spark = SparkSession.builder().getOrCreate()
https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
First create

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

Now we have a sqlContext with respect to sc (sc is available automatically when we launch spark-shell). Now:

import sqlContext.implicits._
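For example, in the spark-shell (the sample RDD and column name are illustrative):

val numbers = sc.parallelize(1 to 5)
val df = numbers.toDF("n") // compiles only once sqlContext.implicits._ is in scope
df.show()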
You are using an old version of Spark SQL (1.0.0, while the rest of your build targets 1.5.2). Change it to:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.5.2</version>
</dependency>
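More generally, keep every Spark artifact at the same version; the pom above mixes 1.5.2, 1.2.1, 1.0.0, and 0.9.1. A sketch of the same fix applied to the other mismatched entries:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.10</artifactId>
  <version>1.5.2</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-hive_2.10</artifactId>
  <version>1.5.2</version>
</dependency>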
For someone using sbt to build, update the library versions to
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.12" % "2.4.6" % "provided",
"org.apache.spark" % "spark-sql_2.12" % "2.4.6" % "provided"
)
And then import the SQL implicits as below.
val spark = SparkSession.builder()
  .appName("appName")
  .getOrCreate()

import spark.sqlContext.implicits._
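With that import in scope, the conversions work as usual (the data below is illustrative):

val df = Seq(("a", 1), ("b", 2)).toDF("key", "value")
df.show()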