
What is Hive? Is it a database? [closed]

Tags:

hadoop

hive

hbase

I just started exploring Hive. It has structures similar to an RDBMS, such as tables, joins, and partitions. What I understand is that Hive still uses HDFS for storage and is an SQL abstraction over HDFS. From this I am not sure whether Hive itself is a database solution like HBase or Cassandra, or simply a query system on top of HDFS. I don't think it is simply a query language, because it has tables, joins, and partitions.

Brainchild asked Nov 17 '13 12:11





1 Answer

Hive is a data warehousing package/infrastructure built on top of Hadoop. It provides an SQL dialect called Hive Query Language (HiveQL, often abbreviated HQL) for querying data stored in a Hadoop cluster. Like all SQL dialects in widespread use, HiveQL doesn't fully conform to any particular revision of the ANSI SQL standard. It is perhaps closest to MySQL's dialect, but with significant differences. Traditionally Hive offered no support for row-level inserts, updates, and deletes, and no transactions. Since Hive 0.14 these are available, but only on tables explicitly declared as transactional (ACID), and they are not intended for OLTP-style workloads. So we can't compare it directly with an RDBMS.

Hive adds extensions to provide better performance in the context of Hadoop and to integrate with custom extensions and even external programs. It is well suited to batch processing of data, for example log processing, text mining, document indexing, customer-facing business intelligence, predictive modeling, and hypothesis testing.

Hive is not designed for online transaction processing and does not offer real-time queries.
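To make the "query layer over files" idea concrete, here is a toy Python sketch. It is not Hive's API: the `metastore` dictionary, `create_external_table`, `scan`, and `count_by` are all hypothetical names, and a temporary local directory stands in for an HDFS path. It only illustrates the split described above, with table metadata kept in one place, raw files kept in another, and the schema applied when a batch query scans the files.

```python
import csv
import os
import tempfile

# Toy stand-in for the Hive metastore: table name -> data location and column schema.
# All names here are hypothetical; this is not Hive's API.
metastore = {}


def create_external_table(name, location, columns):
    """Register metadata only; the files under `location` are not touched."""
    metastore[name] = {"location": location, "columns": columns}


def scan(name):
    """Apply the schema at read time to every file in the table's directory."""
    table = metastore[name]
    for filename in sorted(os.listdir(table["location"])):
        with open(os.path.join(table["location"], filename), newline="") as f:
            for raw in csv.reader(f):
                yield {col: cast(val) for (col, cast), val in zip(table["columns"], raw)}


def count_by(name, column):
    """Roughly what SELECT column, COUNT(*) ... GROUP BY column computes as a full scan."""
    counts = {}
    for row in scan(name):
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


def demo():
    # A temporary directory stands in for an HDFS path holding raw log files.
    with tempfile.TemporaryDirectory() as data_dir:
        with open(os.path.join(data_dir, "part-0000.csv"), "w", newline="") as f:
            csv.writer(f).writerows([["alice", "200"], ["bob", "404"], ["alice", "200"]])
        create_external_table("access_log", data_dir, [("user", str), ("status", int)])
        return count_by("access_log", "status")


if __name__ == "__main__":
    print(demo())
```

Note that every query in this sketch rescans all the files, which is one way to see why this style of system suits batch analysis rather than low-latency lookups.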

Sandeep Singh answered Sep 20 '22 06:09
