In a cluster having Hive installed, What does the metastore and namenode have? i understand that the Metastore has all the table schema and partition details and metadata. Now what is this metadata? then what does the namenode have? and where is this metastore present in a cluster?
The NameNode keeps the directory tree of all files in the file system, and tracks where across the cluster the file data is kept. It also keeps track of all the DataNode(Dead+Live) through heartbeat mechanism. It also helps client for reads/writes by receiving their requests and redirecting them to the appropriate DataNode.
The metadata which metastore stores contains things like :
IDs of Database
IDs of Tables
IDs of Index
The time of creation of an Index
The time of creation of a Table
IDs of roles assigned to a particular user
InputFormat used for a Table
OutputFormat used for a Table etc etc.
Is this what you wanted to know?
And it is not mandatory to have metastore in the cluster itself. Any machine(inside or outside the cluster) having a JDBC-compliant database can be used for the metastore.
HTH
P.S : You might find the E/R diagram of metastore useful.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With