How to execute SQL queries in Apache Spark

I am very new to Apache Spark.
I have already configured Spark 2.0.2 on my local Windows machine and have worked through the "word count" example.
Now I am stuck on executing SQL queries. I have searched for guidance on this but have not found anything helpful.

asked Nov 28 '16 by rajkumar chilukuri

People also ask

How do I run a SQL query on Spark?

In Spark 2.0.2, SparkSession contains a SparkContext instance as well as a sqlContext instance. Create a SparkSession, load your data (in this case from MySQL), register it as a temporary view, and then run your SQL query just as you would against a SQL database.

Can we use SQL queries directly in Spark?

Spark SQL can read directly from multiple sources (files, HDFS, JSON/Parquet files, existing RDDs, Hive, etc.) and ensures fast execution of existing Hive queries; Spark SQL can execute up to 100x faster than Hadoop MapReduce.

Does Apache Spark support SQL?

Spark SQL brings native support for SQL to Spark and streamlines the process of querying data stored both in RDDs (Spark's distributed datasets) and in external sources. Spark SQL conveniently blurs the lines between RDDs and relational tables.

What are Spark SQL query execution phases?

Catalyst's general tree transformation framework is applied in four phases: analysis, logical optimization, physical planning, and code generation.


1 Answer

In Spark 2.0.2 we have SparkSession, which contains a SparkContext instance as well as a sqlContext instance. So the steps are:

Step 1: Create SparkSession

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("MyApp").master("local[*]").getOrCreate()

Step 2: Load the table from the database, in your case MySQL.

val loadedData = spark
      .read
      .format("jdbc")
      .option("url", "jdbc:mysql://localhost:3306/mydatabase")
      .option("driver", "com.mysql.jdbc.Driver")
      .option("dbtable", "mytable")
      .option("user", "root")
      .option("password", "toor")
      .load()

loadedData.createOrReplaceTempView("mytable")
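For the JDBC read above to work, the MySQL connector has to be on Spark's classpath. A minimal build.sbt sketch (the version numbers here are illustrative, matching the Spark 2.0.2 setup in the question, not prescriptive):

```scala
// build.sbt — illustrative versions; adjust to your Spark and MySQL setup
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"            % "2.0.2" % "provided",
  "mysql"            %  "mysql-connector-java" % "5.1.40"
)
```

If you are experimenting in the spark-shell instead, you can pass the connector jar on the command line with `--jars`, e.g. `spark-shell --jars mysql-connector-java-5.1.40-bin.jar`.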

Step 3: Now you can run your SQL query just as you would against a SQL database.

val dataFrame = spark.sql("SELECT * FROM mytable")
dataFrame.show()

P.S.: It would be better to use the DataFrame API, or better still the Dataset API, but for those you will need to go through the documentation.
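As a rough sketch of the Dataset route (the `User` case class and its fields are assumptions for illustration, not from the original question; adjust them to your table's schema):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical row type for the MySQL table; adjust fields to your schema.
case class User(id: Long, name: String)

object DatasetExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("MyApp").master("local[*]").getOrCreate()
    import spark.implicits._

    // Same JDBC options as in Step 2, but converted to a typed Dataset[User]
    // instead of registering a temp view and writing SQL strings.
    val users = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://localhost:3306/mydatabase")
      .option("driver", "com.mysql.jdbc.Driver")
      .option("dbtable", "mytable")
      .option("user", "root")
      .option("password", "toor")
      .load()
      .as[User]

    // Typed equivalent of: SELECT name FROM mytable WHERE id > 100
    users.filter(_.id > 100).map(_.name).show()

    spark.stop()
  }
}
```

The upside of the typed API is that field references like `_.id` are checked at compile time, whereas a typo inside a `spark.sql(...)` string only fails at runtime.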

Link to Documentation: https://spark.apache.org/docs/2.0.0/api/scala/index.html#org.apache.spark.sql.Dataset

answered Sep 21 '22 by Shivansh