
Where does Hive data get stored?

I am a little confused about where Hive stores its data.

Does it store its data in HDFS or in an RDBMS? Does the Hive Metastore use an RDBMS to store the metadata for Hive tables?

Thanks in advance!

Naman Agarwal asked Apr 27 '17 12:04



1 Answer

Hive table data is stored in a Hadoop-compatible filesystem: HDFS, S3, or another compatible filesystem.

Hive metadata (table definitions, columns, partitions, and data locations) is stored by the Metastore in an RDBMS such as MySQL; see the list of supported RDBMS.

The location of a Hive table's data in S3 or HDFS can be specified for both managed and external tables.
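As a rough illustration of how a table's storage path is usually determined, the sketch below resolves a location the way a stock setup commonly does: an explicit location wins, otherwise the path falls under the warehouse directory. The function name `resolve_table_location` is hypothetical, and the `/user/hive/warehouse` default and the `<db>.db` directory convention are assumptions about a default `hive.metastore.warehouse.dir` configuration; your cluster or Hive version may lay things out differently.

```python
from typing import Optional

# Assumed default of hive.metastore.warehouse.dir; clusters often override it.
DEFAULT_WAREHOUSE_DIR = "/user/hive/warehouse"


def resolve_table_location(
    database: str,
    table: str,
    explicit_location: Optional[str] = None,
    warehouse_dir: str = DEFAULT_WAREHOUSE_DIR,
) -> str:
    """Hypothetical helper: return the directory that would hold a table's files."""
    if explicit_location is not None:
        # An explicit location (HDFS, S3, ...) overrides the warehouse default.
        return explicit_location.rstrip("/")
    base = warehouse_dir.rstrip("/")
    if database == "default":
        # Tables in the default database sit directly under the warehouse dir.
        return f"{base}/{table}"
    # Other databases conventionally get a "<name>.db" subdirectory.
    return f"{base}/{database}.db/{table}"


print(resolve_table_location("default", "events"))
print(resolve_table_location("sales", "orders"))
print(resolve_table_location("sales", "orders", "s3a://my-bucket/orders/"))
```

The bucket, database, and table names here are made up for the example. To see the real location of one of your tables, `DESCRIBE FORMATTED <table>` in Hive reports it.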

The difference between managed and external tables is in what DROP TABLE does. For a managed table, DROP TABLE removes the table definition and deletes the table's data. For an external table, DROP TABLE removes only the table definition; the data remains in place and can be used to create other tables over it.
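The DROP TABLE behavior can be sketched with a toy model. This is not Hive's implementation: the `ToyHive` class, its two dictionaries (one standing in for the Metastore RDBMS, one for HDFS/S3), and all paths and table names are hypothetical and exist only to show which side each operation touches.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Table:
    location: str   # directory in HDFS/S3 holding the table's files
    external: bool  # True for external tables, False for managed ones


@dataclass
class ToyHive:
    """Toy model only: a metastore dict plus a fake filesystem dict."""
    metastore: Dict[str, Table] = field(default_factory=dict)  # stands in for the RDBMS
    filesystem: Dict[str, str] = field(default_factory=dict)   # stands in for HDFS/S3

    def create_table(self, name: str, location: str, external: bool) -> None:
        # Creating a table records metadata; files already at the location are kept.
        self.metastore[name] = Table(location, external)
        self.filesystem.setdefault(location, "")

    def drop_table(self, name: str) -> None:
        table = self.metastore.pop(name)  # metadata is removed in both cases
        if not table.external:
            # Managed table: the data directory is deleted as well.
            self.filesystem.pop(table.location, None)


hive = ToyHive()
hive.filesystem["/data/logs"] = "raw log files"  # data that exists before any table
hive.create_table("managed_t", "/warehouse/managed_t", external=False)
hive.create_table("external_t", "/data/logs", external=True)

hive.drop_table("managed_t")
hive.drop_table("external_t")

print("/warehouse/managed_t" in hive.filesystem)  # False: managed data is gone
print("/data/logs" in hive.filesystem)            # True: external data survives

# The surviving files can back a new table definition.
hive.create_table("logs_again", "/data/logs", external=True)
print(hive.filesystem["/data/logs"])              # raw log files
```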

See details here: Create/Drop/Truncate Table

leftjoin answered Sep 30 '22 15:09
