
How to get the partition column name of a Hive table

Tags:

hive

I'm developing a Unix script that deals with Hive tables partitioned by either column A or column B. I'd like to find out which column a table is partitioned on so that I can run subsequent operations on those partitions.

Is there any property in Hive which returns the partition column directly?

If there is no other way, I think I'll have to run show create table and extract the partition column name from its output.

Vinay asked Aug 19 '16 02:08




2 Answers

This may not be the best approach, but one option is to use the describe command.

Create table:

create table employee ( id int, name string ) PARTITIONED BY (city string);

Command:

hive -e 'describe formatted employee' | awk '/Partition Information/ {p=1}; /Detailed Table Information/ {p=0}; p'

Output:

# Partition Information
# col_name              data_type               comment

city                    string

You can refine it as needed.
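If only the column names are needed, the filter could be tightened along these lines. This is a minimal sketch, not a tested production script: the function name partition_columns is made up for illustration, and it assumes the "# Partition Information" and "# Detailed Table Information" section headers appear as in the output above, which may vary between Hive versions.

```shell
# Read `describe formatted` output on stdin and print one partition
# column name per line (partition_columns is a hypothetical helper name).
partition_columns() {
  awk '
    /# Partition Information/      { in_part = 1; next }  # partition section starts
    /# Detailed Table Information/ { in_part = 0 }        # partition section ends
    in_part && NF && $1 !~ /^#/    { print $1 }           # skip blank and "# col_name" rows
  '
}

# Usage (requires a working hive CLI):
#   hive -e 'describe formatted employee' | partition_columns
```

For the employee table above this should print just city, which is easier to consume in a script than the full section.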

Another option, which I didn't explore, is to query the metastore repository tables to get the partition column information for a table.
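As a rough sketch of that metastore option: the helper below only builds a SQL query. It assumes the metastore database uses the usual DBS, TBLS and PARTITION_KEYS tables, and the function name partition_keys_sql, the MySQL backend, the database name metastore and the user hive_user are all assumptions for illustration. Check the table and column names against your own metastore schema before relying on it.

```shell
# Print a SQL query listing the partition columns of a Hive table.
# Assumes metastore tables DBS, TBLS and PARTITION_KEYS.
# The names are interpolated unescaped, so pass trusted values only.
partition_keys_sql() {
  db=$1
  tbl=$2
  cat <<EOF
SELECT pk.PKEY_NAME, pk.PKEY_TYPE
FROM PARTITION_KEYS pk
JOIN TBLS t ON pk.TBL_ID = t.TBL_ID
JOIN DBS d ON t.DB_ID = d.DB_ID
WHERE d.NAME = '${db}' AND t.TBL_NAME = '${tbl}'
ORDER BY pk.INTEGER_IDX;
EOF
}

# Hypothetical usage against a MySQL-backed metastore:
#   mysql -N -u hive_user -p metastore -e "$(partition_keys_sql default employee)"
```

This avoids starting a Hive session, but it needs direct read access to the metastore database, which many clusters do not grant to ordinary users.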

Aditya answered Oct 21 '22 07:10


Through the Scala/Java API, we can connect to the Hive metastore and get the partition column names using org.apache.hadoop.hive.metastore.HiveMetaStoreClient:

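import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient
import scala.collection.JavaConverters._

val conf = new Configuration()
conf.set("hive.metastore.uris","thrift://hdppmgt02.domain.com:9083")
val hiveConf = new HiveConf(conf, classOf[HiveConf])
val metastoreClient = new HiveMetaStoreClient(hiveConf)

// db and tbl are the database and table names (Strings); getPartitionKeys returns a Java list of FieldSchema
metastoreClient.getTable(db, tbl).getPartitionKeys.asScala.foreach(x => println("Keys : " + x.getName))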
Karthik answered Oct 21 '22 09:10