
Spark Scala: Unable to import sqlContext.implicits._

I tried the code below, but I cannot import sqlContext.implicits._. The Scala IDE reports the following error and the code fails to build:

value implicits is not a member of org.apache.spark.sql.SQLContext

Do I need to add any dependencies in pom.xml?

Spark version 1.5.2

package com.Spark.ConnectToHadoop

import org.apache.spark._
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext

object CountWords {

  def main(args: Array[String]) {

    val objConf = new SparkConf().setAppName("Spark Connection").setMaster("spark://IP:7077")
    val sc = new SparkContext(objConf)

    val objHiveContext = new HiveContext(sc)
    objHiveContext.sql("USE test")
    val rdd = objHiveContext.sql("select * from Table1")
    val options = Map("path" -> "hdfs://URL/apps/hive/warehouse/test.db/TableName")

    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._      // Error

    val dataframe = rdd.toDF()
    dataframe.write.format("orc").options(options).mode(SaveMode.Overwrite).saveAsTable("TableName")
  }
}

My pom.xml file is as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.Sudhir.Maven1</groupId>
  <artifactId>SparkDemo</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>SparkDemo</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.5.2</version>
    </dependency> 
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>1.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.10</artifactId>
      <version>1.5.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>0.9.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.10</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>1.2.1</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>     

  </dependencies>
</project>

People also ask

What does import spark.implicits._ do?

The implicits object provides implicit conversions for turning Scala objects (including RDDs) into a Dataset, DataFrame, or Column, and supplies the Encoders needed for such conversions. For example, it creates a DatasetHolder with an input Seq[T] converted to a Dataset[T] (using SparkSession.createDataset).
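A minimal sketch of what the import enables, assuming Spark 2.x running locally (the app name and sample data are made up for illustration):

import org.apache.spark.sql.SparkSession

object ImplicitsDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ImplicitsDemo")
      .master("local[*]")
      .getOrCreate()

    // Without this import, toDF/toDS are not available on local collections or RDDs.
    import spark.implicits._

    // Seq -> DataFrame via the implicit DatasetHolder conversion.
    val df = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")
    df.show()

    spark.stop()
  }
}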

How do I get SQLContext in spark-shell?

You have to create a sqlContext in order to execute SQL statements. In Spark 2.0 you can create a SQLContext easily from a SparkSession, as sketched below; alternatively, you can execute SQL statements directly on the SparkSession.
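A hedged sketch of both approaches in Spark 2.x, where spark is the SparkSession that spark-shell creates automatically (the table name my_table is hypothetical):

// In spark-shell (Spark 2.x), a SparkSession named `spark` already exists.
val sqlContext = spark.sqlContext                // legacy SQLContext, kept for compatibility
sqlContext.sql("SELECT * FROM my_table").show()

// Or skip SQLContext entirely and run SQL on the session itself:
spark.sql("SELECT * FROM my_table").show()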

What is Spark SQLContext?

SQLContext is a class used to initialize the functionality of Spark SQL. A SparkContext object (sc) is required to initialize a SQLContext object. The following command starts the SparkContext through spark-shell:

$ spark-shell

What is sc in SQLContext?

// sc is an existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// This import is used to implicitly convert an RDD to a DataFrame.
import sqlContext.implicits._

// Define the schema using a case class.


4 Answers

With the release of Spark 2.0.0 (July 26, 2016), one should now use the following:

import spark.implicits._  // spark = SparkSession.builder().getOrCreate()

https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
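For context, a minimal hedged sketch of the Spark 2.x pattern the post describes (the app name and data below are placeholders):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Spark Connection")
  .getOrCreate()

// The implicits now live on the SparkSession instance, not on a SQLContext.
import spark.implicits._

val df = Seq(1, 2, 3).toDF("value")   // toDF comes from spark.implicits._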


First create:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

Now we have a sqlContext built from sc (in spark-shell, sc is available automatically when the shell launches). Now:

import sqlContext.implicits._
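A short usage sketch once the import is in scope (the RDD contents and column names are made up):

// After the import, RDDs of tuples or case classes gain toDF().
val rdd = sc.parallelize(Seq(("alice", 30), ("bob", 25)))
val df = rdd.toDF("name", "age")
df.show()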

You are using an old version of Spark SQL. Change it to:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.5.2</version>
</dependency>
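More generally, the root cause is mixing module versions (the question's pom also pins spark-streaming to 0.9.1 and spark-hive to 1.2.1). A hedged sketch of one way to keep every Spark artifact aligned, assuming 1.5.2 across the board:

<properties>
  <spark.version>1.5.2</spark.version>
</properties>
...
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>${spark.version}</version>
</dependency>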

For anyone using sbt to build, update the library versions to:

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.12" % "2.4.6" % "provided",
  "org.apache.spark" % "spark-sql_2.12" % "2.4.6" % "provided"
)

Then import the implicits from the session's SQLContext as below:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("appName")
  .getOrCreate()

import spark.sqlContext.implicits._
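A brief hedged usage example with that import in scope (the data and column names are invented):

// With the implicits in scope, local Seqs convert straight to DataFrames.
val people = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")
people.printSchema()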