I am creating a Scala program that uses SQLContext, building it with sbt. This is my build.sbt:
name := "sampleScalaProject"
version := "1.0"
scalaVersion := "2.11.7"
//libraryDependencies += "org.apache.spark" %% "spark-core" % "2.5.2"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.5.2"
libraryDependencies += "org.apache.kafka" % "kafka_2.11" % "0.8.2.2"
libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "1.5.2"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "1.5.2"
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.6.0"
And this is the test program:
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object SqlContextSparkScala {
  def main(args: Array[String]) {
    val sc = SparkContext
    val sqlcontext = new SQLContext(sc)
  }
}
I am getting the below error:
Error:(8, 26) overloaded method constructor SQLContext with alternatives:
(sparkContext: org.apache.spark.api.java.JavaSparkContext)org.apache.spark.sql.SQLContext <and>
(sparkContext: org.apache.spark.SparkContext)org.apache.spark.sql.SQLContext
cannot be applied to (org.apache.spark.SparkContext.type)
val sqlcontexttest = new SQLContext(sc)
Can anybody please let me know what the issue is, as I am very new to Scala and Spark programming?
You can create an SQLContext in the Spark shell by passing the default SparkContext object (sc) as a parameter to the SQLContext constructor.
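In a standalone program, however, there is no predefined sc: the line val sc = SparkContext in the question refers to the SparkContext companion object (hence the SparkContext.type in the error), not to an instance. Below is a minimal sketch of creating the instance first, assuming Spark 1.5.x; the app name and local master are placeholder choices:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SqlContextSparkScala {
  def main(args: Array[String]): Unit = {
    // Placeholder configuration; adjust the app name and master for your setup
    val conf = new SparkConf().setAppName("sampleScalaProject").setMaster("local[*]")
    // new SparkContext(conf) creates an instance, unlike the bare companion object
    val sc = new SparkContext(conf)
    // The SQLContext constructor now receives a real SparkContext instance
    val sqlcontext = new SQLContext(sc)
    sc.stop()
  }
}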
To create a SparkSession from a Scala (or Python) program, you use the builder pattern: call builder() and then getOrCreate(). If a SparkSession already exists, getOrCreate() returns it; otherwise it creates a new one.
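A minimal sketch of that pattern (the app name and local master are placeholder choices, not requirements):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("sampleScalaProject")   // placeholder application name
  .master("local[*]")              // placeholder master for local testing
  .getOrCreate()                   // returns the existing session if one already exists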
SQLContext is the entry point to Spark SQL, a Spark module for structured data processing. Once the SQLContext is initialised, you can use it to perform various SQL-like operations over Datasets and DataFrames.
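For example, once an SQLContext exists you can register a DataFrame and query it with SQL; here is a sketch against Spark 1.5.x, where the data, table name and column names are made up for illustration and sc/sqlcontext are the objects created in the sketch above:

import sqlcontext.implicits._

val people = sc.parallelize(Seq(("Alice", 29), ("Bob", 31))).toDF("name", "age")
people.registerTempTable("people")   // Spark 1.x API; Spark 2.x uses createOrReplaceTempView
sqlcontext.sql("SELECT name FROM people WHERE age > 30").show()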
SparkContext is the Scala entry point, and JavaSparkContext is a Java wrapper around it. SQLContext is the entry point of Spark SQL and can be obtained from a SparkContext. Prior to 2.x, RDD, DataFrame and Dataset were three different data abstractions.
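The two constructor alternatives listed in the error message above reflect exactly this: an SQLContext can be built from either one. A sketch, assuming a SparkConf named conf as in the earlier example:

import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.sql.SQLContext

// JavaSparkContext simply wraps a SparkContext for Java callers
val jsc = new JavaSparkContext(conf)
val sqlFromJava = new SQLContext(jsc)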
In Spark 1.x you need to pass a SparkContext object to the constructor in order to create an SQLContext instance. In the Scala shell you do this as in the example below:

scala> val sqlcontext = new org.apache.spark.sql.SQLContext(sc)
The entry point into all functionality in Spark SQL is the SQLContext class, or one of its descendants. To create a basic SQLContext, all you need is a SparkContext.
Internally, Spark SQL uses this extra structural information to perform extra optimizations. There are several ways to interact with Spark SQL, including SQL, the DataFrame API and the Dataset API. When computing a result, the same execution engine is used, independent of which API/language you use to express the computation.
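To illustrate, the same filter can be expressed through SQL or through the DataFrame API and both run on the same engine; this sketch reuses the people DataFrame and temp table registered above:

// Same result, two APIs: a SQL string vs. DataFrame operations
val viaSql = sqlcontext.sql("SELECT name FROM people WHERE age > 30")
val viaApi = people.filter(people("age") > 30).select("name")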
For newer versions of Spark (2.0+), use SparkSession:

val spark = SparkSession.builder.getOrCreate()

SparkSession can do everything SQLContext can do, but if needed the SQLContext can still be accessed as follows:

val sqlContext = spark.sqlContext
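For completeness, here is a sketch of the 2.x equivalent of the 1.x example above (the data, names and master are still placeholders):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("sampleScalaProject").master("local[*]").getOrCreate()
import spark.implicits._

val people = Seq(("Alice", 29), ("Bob", 31)).toDF("name", "age")
people.createOrReplaceTempView("people")   // 2.x replacement for registerTempTable
spark.sql("SELECT name FROM people WHERE age > 30").show()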