
How does Spark's HiveContext/SQLContext retrieve schema and data?

I can't find much documentation on this. When I pull data from Hive in Spark SQL, how does it retrieve the schema? Does it automatically look in the Hive metastore? Is Hive telling Spark to look at the file location to pull the data into a DataFrame? And how does it handle a view, or can it not handle views yet?

theMadKing asked Nov 19 '25 22:11



1 Answer

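  1. Yes, it looks the table up in the Hive metastore.
  2. Spark does not hand the query to Hive for execution. It reads the table's schema and storage location from the metastore, reads the underlying data itself, and returns the result as a DataFrame of rows. From the docs: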

"When working with Hive one must construct a HiveContext, which inherits from SQLContext, and adds support for finding tables in the MetaStore and writing queries using HiveQL."
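To see the metastore lookup in practice, here is a minimal PySpark sketch. It assumes the Spark 1.x API (where `HiveContext` lives in `pyspark.sql`) and a `hive-site.xml` that points Spark at your metastore. The helper names `make_hive_context` and `describe_table` and the table name `sales.orders` are hypothetical, chosen only for illustration.

```python
def make_hive_context(app_name="metastore-demo"):
    """Build a HiveContext (Spark 1.x API).

    Assumption: pyspark is installed and hive-site.xml is on Spark's
    classpath, so the context can reach the Hive metastore.
    """
    # Imported lazily so the rest of this sketch loads without pyspark.
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName=app_name)
    return HiveContext(sc)


def describe_table(ctx, table_name):
    """Return the (column name, type) pairs of a table known to the context.

    ctx.table() resolves the name through the context's catalog, which for
    a HiveContext is backed by the Hive metastore. No column definitions
    are supplied here; they come from the stored table metadata.
    """
    df = ctx.table(table_name)
    return df.dtypes
```

Usage, with a hypothetical table name: `describe_table(make_hive_context(), "sales.orders")` returns a list of `(name, type)` tuples, and `ctx.table("sales.orders").printSchema()` prints the same information as a tree.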

ayan guha answered Nov 23 '25 00:11
