
Is it possible to change partition metadata in HIVE?

This is an extension of a previous question I asked: How to compare two columns with different data type groups

We are exploring the idea of changing the metadata on the table as opposed to performing a CAST operation on the data in SELECT statements. Changing the metadata in the MySQL metastore is easy enough. But, is it possible to have that metadata change applied to partitions (they are daily)? Otherwise, we might be stuck with current and future data being of type BIGINT while the historical is STRING.

Question: Is it possible to change partition metadata in Hive? If yes, how?

J Weezy asked Oct 19 '25 00:10


2 Answers

You can change the type of a partition column with this statement:

alter table {table_name} partition column ({column_name} {column_type});
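As a concrete illustration, here is a minimal sketch of that statement applied to a hypothetical table `events` partitioned by a column `dt` (both names are assumptions, not taken from the question). It targets the partition key itself, not the regular data columns of the table.

```sql
-- Hypothetical table: events, partitioned by dt (currently declared as STRING).
-- Change the declared type of the partition key to INT.
ALTER TABLE events PARTITION COLUMN (dt INT);

-- Inspect the result: the partition key and its type are listed
-- in the partition information section of the output.
DESCRIBE FORMATTED events;
```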

You can also re-create the table definition and change the types of all columns using these steps:

  1. Make your table external, so it can be dropped without dropping the data

    ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='TRUE');

  2. Drop table (only metadata will be removed).

  3. Create EXTERNAL table using updated DDL with types changed and with the same LOCATION.

  4. Recover partitions:

    MSCK [REPAIR] TABLE tablename;

The equivalent command on Amazon Elastic MapReduce (EMR)'s version of Hive is:

ALTER TABLE tablename RECOVER PARTITIONS;

This will add the Hive partition metadata. See manual here: RECOVER PARTITIONS

  5. Finally, you can make your table MANAGED again if necessary:

ALTER TABLE tablename SET TBLPROPERTIES('EXTERNAL'='FALSE');

Note: All of the commands above should be run in Hive (for example from HUE), not in MySQL.
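Put together, the steps above might look like the following sketch. The table name `abc` comes from the example above; the columns, the partition key `dt`, the storage format, and the LOCATION path are hypothetical placeholders and must be replaced with the values from your original DDL.

```sql
-- 1. Protect the data files from DROP TABLE.
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='TRUE');

-- 2. Drop the table; only the metastore entry is removed.
DROP TABLE abc;

-- 3. Re-create it with the new column types over the same directory.
--    Columns, format, and path below are hypothetical placeholders.
CREATE EXTERNAL TABLE abc (
  id      BIGINT,   -- assumed to be STRING in the old definition
  payload STRING
)
PARTITIONED BY (dt STRING)
STORED AS TEXTFILE              -- must match how the files were written
LOCATION '/user/hive/warehouse/abc';

-- 4. Re-register the partition directories found under LOCATION.
MSCK REPAIR TABLE abc;
SHOW PARTITIONS abc;

-- 5. Optional: make the table managed again.
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='FALSE');
```

MSCK REPAIR TABLE only discovers directories that follow the `column=value` naming convention (for example `dt=2025-10-19`). Partitions stored under other paths have to be registered one by one with ALTER TABLE ... ADD PARTITION ... LOCATION.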

leftjoin answered Oct 20 '25 16:10


You cannot change the partition column in Hive; in fact, Hive does not support altering partitioning columns.

See: altering partition column type in Hive

You can think of it this way: Hive stores the data by creating a folder in HDFS for each partition column value. Altering a partition column would therefore mean changing the whole directory structure and data of the Hive table, which is not possible. For example, if you have partitioned on year, the directory structure looks like this:

tab1/clientdata/2009/file2
tab1/clientdata/2010/file3
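To check how partitions map to directories on an actual table, the statements below can help. This is only a sketch: the table name `clientdata` and the partition column `year` are assumptions read off the paths above. By default Hive names partition directories `column=value` (for example `year=2009`), so the layout on your cluster may differ from the paths shown.

```sql
-- List the partitions registered in the metastore for the (hypothetical) table.
SHOW PARTITIONS clientdata;

-- Show the metadata of one partition, including its storage location.
DESCRIBE FORMATTED clientdata PARTITION (year=2009);
```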

If you want to change the partition column, you can perform the steps below:

  1. Create another Hive table with the required changes in the partition column:

    CREATE TABLE new_table (A INT, .....) PARTITIONED BY (B STRING)

  2. Load the data from the previous table:

    INSERT INTO TABLE new_table PARTITION (B) SELECT A, B FROM Prev_table;
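The load step relies on dynamic partitioning, which usually has to be enabled for the session first. The following is a minimal sketch, assuming `Prev_table` has only the columns A and B and that the goal is to turn the partition key B into an INT; the column list and target type are illustrative assumptions.

```sql
-- Allow every partition value to be derived from the data.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical new table: B is declared as an INT partition key.
CREATE TABLE new_table (A INT) PARTITIONED BY (B INT);

-- The dynamic partition column must come last in the SELECT list.
INSERT INTO TABLE new_table PARTITION (B)
SELECT A, CAST(B AS INT) FROM Prev_table;
```

Because this rewrites the data into new partition directories, it takes longer than a metadata-only change, but the resulting table has a consistent type across historical and new partitions.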

Strick answered Oct 20 '25 14:10