 

Return Temporary Spark SQL Table in Scala

First I convert a CSV file to a Spark DataFrame using

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("/usr/people.csv")

After that, typing df and pressing return shows

res30: org.apache.spark.sql.DataFrame = [name: string, age: string, gender: string, deptID: string, salary: string]

Then I use df.registerTempTable("people") to convert df to a Spark SQL table.

But after that, when I type people expecting to get the table back, I get

<console>:33: error: not found: value people

Is it because people is a temporary table?

Thanks

Gavin Niu asked Dec 29 '25


1 Answer

When you register a temp table with the registerTempTable command you used, it becomes available inside your SQLContext only.

This means that the following is incorrect and gives you the error you are seeing:

scala> people.show
<console>:33: error: not found: value people

To use the temp table, you'll need to query it through your sqlContext. Example:

scala> sqlContext.sql("select * from people")
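Putting the whole flow together, a minimal spark-shell session might look like the sketch below (Spark 1.x with the spark-csv package; sqlContext is provided by the shell, and the path and column names match the question):

```scala
// Load the CSV into a DataFrame; "header" tells spark-csv to use the first row as column names.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("/usr/people.csv")

// Register the DataFrame as a temp table scoped to this SQLContext.
df.registerTempTable("people")

// Query through the SQLContext; the result is itself a DataFrame.
val result = sqlContext.sql("select * from people")
result.show()
```

Note that sqlContext.sql returns a new DataFrame, so you can chain further transformations or register the result as another temp table.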

Note: df.registerTempTable("df") registers a temporary table named df corresponding to the DataFrame df you call the method on.

So persisting df won't persist the table but the DataFrame, even though the SQLContext will be using that DataFrame.
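As a sketch of that distinction (assuming the df and people table from above), caching acts on the DataFrame, while the SQLContext also offers a separate table-level cache:

```scala
// Persisting acts on the DataFrame object, not on the temp table name.
df.cache()     // lazy: marks df for caching
df.count()     // an action materializes the cache

// SQL queries against "people" now reuse the cached DataFrame.
sqlContext.sql("select count(*) from people").show()

// Alternatively, cache by table name through the SQLContext:
sqlContext.cacheTable("people")
```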

eliasah answered Dec 31 '25


