This is probably a very basic question so please pardon the ignorance.
I understand there are two metastores that Hive will use in an out-of-the-box (hive tar.bin extract) vanilla setup. In my case I have Hive 0.14.
There is one in a Derby database, in a folder named metastore_db by default, outside of HDFS.
And there is another in hdfs at /user/hive/warehouse.
What is the difference between these two?
In Hive, the metastore consists of (1) the metastore service and (2) the database.
Metastore DB - any JDBC-compliant RDBMS, in which Hive stores schema and partition details for both managed and external tables. Other applications, such as Impala, can read table and schema details from it. As the name suggests, it stores only metadata.
Metastore Service - Hive also runs a separate service, the metastore service, to manage this metadata: it stores the metadata for Hive tables and partitions in the Metastore DB and gives clients (including Hive itself) access to it through the metastore service API.
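Warehouse - Hive table data is stored in HDFS, normally under /user/hive/warehouse (or whatever path you set as hive.metastore.warehouse.dir in your hive-site.xml). So the warehouse directory is not a second metastore: it holds the table data itself, while metastore_db holds the metadata that describes it.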
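To make the split concrete, here is a minimal hive-site.xml sketch showing the two settings involved. It assumes the embedded Derby setup from the question; the values shown are the usual defaults as far as I know, so check them against your own installation rather than taking them as given.

```xml
<configuration>
  <!-- Metastore DB: JDBC connection to the RDBMS that holds the metadata.
       Assumed embedded-Derby default; with a relative databaseName, the
       metastore_db folder is created in the directory Hive is started from. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>

  <!-- Warehouse: HDFS directory where managed table data is stored. -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
```

The first property points at the metadata store and the second at the data location, which is why changing one does not move the other.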