Where clause not working in Spark SQL DataFrame

I've created a dataframe which contains 3 columns: zip, lat, lng

I want to select the lat and lng values where zip = 00650

So I tried using:

sqlContext.sql("select lat,lng from census where zip=00650").show()

But it throws an ArrayOutOfBound exception because the result does not have any values in it. If I remove the where clause, it runs fine.

Can someone please explain what I am doing wrong?

Update:

dataframe schema:

root 
|-- zip: string (nullable = true) 
|-- lat: string (nullable = true) 
|-- lng: string (nullable = true)

The first rows are:

+-----+---------+-----------+
|  zip|      lat|        lng|
+-----+---------+-----------+
|00601|18.180555| -66.749961|
|00602|18.361945| -67.175597|
|00603|18.455183| -67.119887|
|00606|18.158345| -66.932911|
|00610|18.295366| -67.125135|
|00612|18.402253| -66.711397|
|00616|18.420412| -66.671979|
|00617|18.445147| -66.559696|
|00622|17.991245| -67.153993|
|00623|18.083361| -67.153897|
|00624|18.064919| -66.716683|
|00627|18.412600| -66.863926|
|00631|18.190607| -66.832041|
|00637|18.076713| -66.947389|
|00638|18.295913| -66.515588|
|00641|18.263085| -66.712985|
|00646|18.433150| -66.285875| 
|00647|17.963613| -66.947127|
|00650|18.349416| -66.578079|

Ishan asked Feb 23 '17 07:02

1 Answer

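As you can see in your schema, zip is of type string, so the value in the where clause has to be a quoted string literal. Without the quotes, 00650 is read as a number rather than as a string. Your query should be something like this: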

sqlContext.sql("select lat, lng from census where zip = '00650'").show()
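
To illustrate why the leading zeros only survive when the zip code is treated as a string, here is a minimal plain-Scala sketch. It does not involve Spark and makes no claim about how Spark itself coerces types; the object and method names (`ZipLiteralSketch`, `matchAsString`, `asNumber`) are made up for this example.

```scala
object ZipLiteralSketch {
  // Zip codes stored the way the DataFrame stores them: as strings with leading zeros.
  val zips: Seq[String] = Seq("00601", "00650", "00651")

  // String comparison keeps the leading zeros significant.
  def matchAsString(target: String): Seq[String] =
    zips.filter(_ == target)

  // Converting to a number drops the leading zeros,
  // so "00650" and "650" can no longer be told apart.
  def asNumber(zip: String): Int = zip.toInt

  def main(args: Array[String]): Unit = {
    println(matchAsString("00650")) // List(00650)
    println(matchAsString("650"))   // List()
    println(asNumber("00650"))      // 650
  }
}
```

The point is only that a zip code is an identifier, not a quantity, so comparing it as a string is the safer choice.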

Update:

If you are using Spark 2, you can do this:

// Assumes an existing SparkSession named sparkSession
import sparkSession.implicits._

val dataFrame = Seq(("10.023", "75.0125", "00650"), ("12.0246", "76.4586", "00650"), ("10.023", "75.0125", "00651")).toDF("lat", "lng", "zip")

dataFrame.printSchema()

// DataFrame API: compare the zip column against a string value
dataFrame.select("*").where(dataFrame("zip") === "00650").show()

// registerTempTable is deprecated in Spark 2; createOrReplaceTempView replaces it
dataFrame.createOrReplaceTempView("census")

sparkSession.sql("SELECT lat, lng FROM census WHERE zip = '00650'").show()

output:

root
 |-- lat: string (nullable = true)
 |-- lng: string (nullable = true)
 |-- zip: string (nullable = true)

+-------+-------+-----+
|    lat|    lng|  zip|
+-------+-------+-----+
| 10.023|75.0125|00650|
|12.0246|76.4586|00650|
+-------+-------+-----+

+-------+-------+
|    lat|    lng|
+-------+-------+
| 10.023|75.0125|
|12.0246|76.4586|
+-------+-------+
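
The snippet in this answer assumes a SparkSession is already in scope. As a rough sketch of a self-contained version, assuming Spark 2.x is on the classpath and that local mode is acceptable for experimenting, something like the following could be used. The object name `CensusZipFilter`, the helper `latLng`, and the app name are made up for illustration; the two sample rows are taken from the question.

```scala
import org.apache.spark.sql.SparkSession

object CensusZipFilter {
  // Hypothetical helper: returns the (lat, lng) pairs for one zip code.
  def latLng(spark: SparkSession, zip: String): Seq[(String, String)] = {
    import spark.implicits._

    // Two sample rows copied from the question; all columns are strings.
    val census = Seq(
      ("00647", "17.963613", "-66.947127"),
      ("00650", "18.349416", "-66.578079")
    ).toDF("zip", "lat", "lng")

    // Passing the zip as a String value keeps the comparison string-to-string.
    census.filter($"zip" === zip)
      .select("lat", "lng")
      .as[(String, String)]
      .collect()
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, intended for experimentation only.
    val spark = SparkSession.builder()
      .appName("census-zip-filter")
      .master("local[*]")
      .getOrCreate()

    latLng(spark, "00650").foreach(println)
    spark.stop()
  }
}
```

Using the column API with a String argument avoids building the SQL text by hand, so the quotes around the zip code cannot be forgotten.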

Prasad Khode answered Sep 23 '22 09:09