I am a little confused about where Hive stores its data.
Does it store its data in HDFS or in an RDBMS? Does the Hive Metastore use an RDBMS to store the Hive tables' metadata?
Thanks in advance!
Hive is a data warehouse infrastructure tool for processing structured data in Hadoop. It sits on top of Hadoop to summarize big data and makes querying and analysis easy.
How Does HDFS Store Data? HDFS divides files into blocks and stores each block on a DataNode. Multiple DataNodes are linked to the master node in the cluster, the NameNode. The master node distributes replicas of these data blocks across the cluster.
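As a rough illustration of the block splitting described above, here is a minimal Python sketch of the arithmetic. It assumes a 128 MiB block size and a replication factor of 3, which are common defaults but are configurable per cluster; the function name `hdfs_block_layout` is made up for this example and is not part of any Hadoop API.

```python
def hdfs_block_layout(file_size_bytes, block_size_bytes=128 * 1024 * 1024, replication=3):
    """Return (blocks, block replicas, raw bytes stored) for one file.

    Assumed defaults: 128 MiB blocks, replication factor 3.
    """
    if file_size_bytes < 0:
        raise ValueError("file size must be non-negative")
    # Integer ceiling division: the last block may be only partially filled.
    num_blocks = -(-file_size_bytes // block_size_bytes)
    # Every block is stored `replication` times on different DataNodes.
    num_replicas = num_blocks * replication
    # A partial last block only occupies its actual size, so raw usage
    # is the file size times the replication factor.
    raw_bytes = file_size_bytes * replication
    return num_blocks, num_replicas, raw_bytes


MIB = 1024 * 1024
blocks, replicas, raw = hdfs_block_layout(300 * MIB)
print(f"300 MiB file: {blocks} blocks, {replicas} block replicas, {raw // MIB} MiB raw")
```

With these assumed defaults, a 300 MiB file is split into two full blocks and one partial block, and each of them is replicated three times.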
Hive table data is stored in a Hadoop-compatible filesystem: HDFS, S3, or another compatible filesystem.
Hive metadata is stored in an RDBMS such as MySQL; see supported RDBMS.
The location of a Hive table's data in S3 or HDFS can be specified for both managed and external tables.
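To make the point about locations concrete, the following Python sketch builds the HiveQL `CREATE TABLE` statements one might run for each table type. The helper `create_table_ddl`, the table names, and the paths are all hypothetical; only the `CREATE [EXTERNAL] TABLE ... LOCATION '...'` form comes from HiveQL. The helper does no quoting or escaping, so treat it as an illustration rather than something to run against untrusted input.

```python
def create_table_ddl(name, columns, location=None, external=False):
    """Build a HiveQL CREATE TABLE statement as a string.

    `columns` is a list of (column_name, hive_type) pairs.
    No identifier or string escaping is performed (illustration only).
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    keyword = "CREATE EXTERNAL TABLE" if external else "CREATE TABLE"
    ddl = f"{keyword} {name} ({cols})"
    if location is not None:
        # LOCATION is accepted for managed and external tables alike.
        ddl += f" LOCATION '{location}'"
    return ddl + ";"


columns = [("id", "INT"), ("name", "STRING")]

# Hypothetical managed table stored at an explicit HDFS path.
print(create_table_ddl("users_managed", columns, "hdfs:///data/users_managed"))

# Hypothetical external table over files that already exist in S3.
print(create_table_ddl("users_external", columns, "s3a://example-bucket/users", external=True))
```

If `LOCATION` is omitted for a managed table, Hive places the data under its configured warehouse directory.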
The difference between managed and external tables is in what a DROP TABLE statement does. For a managed table, DROP TABLE drops the table and deletes the table's data. For an external table, DROP TABLE drops only the table definition; the data remains as is and can be used to create other tables over it.
See details here: Create/Drop/Truncate Table
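The split between metadata and data, and the different DROP TABLE behavior, can be mimicked with a toy Python model. This is not how Hive is implemented: the class `ToyWarehouse`, its dictionaries, and the paths are invented for illustration. One dictionary stands in for the metastore (the RDBMS) and another for the filesystem (HDFS or S3).

```python
class ToyWarehouse:
    """Toy model: metadata and data live in two separate stores."""

    def __init__(self):
        self.metastore = {}   # stands in for the RDBMS: table name -> metadata
        self.filesystem = {}  # stands in for HDFS/S3: location -> rows

    def create_table(self, name, location, external=False):
        self.metastore[name] = {"location": location, "external": external}
        # Reuse data that already exists at the location, if any.
        self.filesystem.setdefault(location, [])

    def drop_table(self, name):
        meta = self.metastore.pop(name)  # the metadata is always removed
        if not meta["external"]:
            # Managed table: the data is deleted together with the table.
            self.filesystem.pop(meta["location"], None)


wh = ToyWarehouse()
wh.create_table("managed_t", "/warehouse/managed_t")
wh.create_table("external_t", "/data/shared", external=True)
wh.filesystem["/data/shared"].append(("row", 1))

wh.drop_table("managed_t")
wh.drop_table("external_t")

print(wh.metastore)   # both table definitions are gone
print(wh.filesystem)  # only the external table's data remains

# A new table can be created over the surviving data.
wh.create_table("external_t2", "/data/shared", external=True)
```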