 

In a hadoop cluster, should hive be installed on all nodes?

I am a newbie to Hadoop / Hive and I have just started reading the docs. There are lots of blogs on installing Hadoop in cluster mode. Also, I know that Hive runs on top of Hadoop. My question is: Hadoop is installed on all the cluster nodes. Should I also install Hive on all the cluster nodes or only on the master node?

Vijay asked Dec 10 '11 11:12



People also ask

Where is Hive installed in a production Hadoop cluster?

Typically on a single node, such as the master node or a dedicated client machine, rather than on every node.

How many nodes does a Hadoop cluster have?

The master node in a Hadoop cluster coordinates the storage of data in HDFS and the parallel computation on that data using MapReduce. It runs three daemons: NameNode, Secondary NameNode, and JobTracker.

What does each node correspond to in a Hadoop cluster?

You can scale out a Hadoop cluster, which means adding more nodes. Hadoop is said to be linearly scalable: for every node you add, you get a corresponding boost in throughput. More generally, if you have n nodes, then adding 1 node gives you roughly 1/n additional computing power.
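To make the 1/n arithmetic concrete, here is a minimal sketch that assumes idealized linear scaling, which real clusters only approximate. The function name `relative_gain` is made up for illustration.

```python
def relative_gain(n: int) -> float:
    """Fractional capacity gained by adding one node to an n-node cluster.

    Assumes ideal linear scaling: total capacity is proportional to the
    number of nodes, so going from n to n + 1 nodes adds 1/n of the
    current capacity.
    """
    if n <= 0:
        raise ValueError("cluster must have at least one node")
    return 1.0 / n


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(f"{n} -> {n + 1} nodes: +{relative_gain(n):.1%} capacity")
```

The gain from each extra node shrinks as the cluster grows, even though the absolute capacity added per node stays the same.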

Why are nodes removed and added frequently in a Hadoop cluster?

In a Hadoop cluster, the manager node is deployed on reliable, high-specification hardware, while the slave nodes are deployed on commodity hardware. The chance of a data node crashing is therefore higher, so you will often see admins removing data nodes from a cluster and adding new ones.


1 Answer

No, it is not something you install on worker nodes. Hive is a Hadoop client. Just run Hive according to the instructions you see at the Hive site.
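As a rough illustration of what "Hive is a Hadoop client" means in practice, the sketch below runs a query from the one machine where Hive is installed. It assumes the `hive` command-line tool is on that machine's `PATH` and is already configured to reach the cluster. The helper names `hive_command` and `run_hive_query` are hypothetical. Only `hive -e`, which executes a query string, comes from the Hive CLI itself.

```python
import shutil
import subprocess
from typing import List


def hive_command(query: str) -> List[str]:
    """Build the argument list for running one query with the Hive CLI."""
    # `hive -e "<query>"` executes the given query string and exits.
    return ["hive", "-e", query]


def run_hive_query(query: str) -> str:
    """Run a query from this client machine and return its standard output.

    Nothing Hive-specific has to be installed on the worker nodes: the
    client submits work to the existing Hadoop cluster.
    """
    if shutil.which("hive") is None:
        raise RuntimeError("hive CLI not found on PATH on this machine")
    result = subprocess.run(
        hive_command(query), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    if shutil.which("hive") is not None:
        print(run_hive_query("SHOW TABLES;"))
    else:
        print("hive CLI not installed here; run this on the client machine")
```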

Sean Owen answered Oct 18 '22 21:10
