Dropping multiple partitions in Impala/Hive

1. I'm trying to drop multiple partitions at once, but I can't get it to work in either Impala or Hive. I tried the following query, both with and without the single quotes (') around the values:

ALTER TABLE cz_prd_corrti_st.s1mme_transstats_info DROP IF EXISTS PARTITION (pr_load_time='20170701000317') PARTITION (pr_load_time='20170701000831')

The error I'm getting is as follows:

AnalysisException: Syntax error in line 3:
PARTITION (pr_load_time='20170701000831')
^
Encountered: PARTITION
Expected: CACHED, LOCATION, PURGE, SET, UNCACHED
CAUSED BY: Exception: Syntax error

The partition column is of type bigint. A query that drops only one partition works as expected:

ALTER TABLE cz_prd_corrti_st.s1mme_transstats_info DROP IF EXISTS
PARTITION   (pr_load_time='20170701000317')

2. Is it good practice to delete the HDFS data first and then drop the partitions in Impala/Hive, or should it be done the other way round?

k_mishap asked Aug 07 '17 09:08



1 Answer

1.

Your syntax is wrong.
In the DROP command the partition specs must be separated by commas (unlike ADD, where they are separated by whitespace).

Demo

hive> create table t (i int) partitioned by (p int);
OK

hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK

hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5

hive> alter table t drop if exists partition (p=1),partition (p=2),partition(p=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK

hive> show partitions t;
OK
partition
p=4
p=5
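
When the list of partitions is long, the comma-separated statement can be generated instead of typed by hand. The following is only a sketch: build_drop_partitions is a hypothetical helper, not part of Hive or any client library. It reuses the table and column names from the question and emits the values unquoted, on the assumption (stated in the question) that the partition column is bigint. The output follows the Hive syntax demonstrated above; whether Impala accepts the same form is not shown here.

```python
def build_drop_partitions(table, column, values):
    """Return one ALTER TABLE statement that drops every listed partition.

    Hypothetical helper for illustration; it only builds the SQL text.
    """
    if not values:
        raise ValueError("at least one partition value is required")
    # int() rejects anything that is not a whole number, so the values can be
    # embedded unquoted (assumption: the partition column is bigint).
    specs = [f"PARTITION ({column}={int(v)})" for v in values]
    # DROP takes the partition specs separated by commas.
    return f"ALTER TABLE {table} DROP IF EXISTS " + ", ".join(specs)


stmt = build_drop_partitions(
    "cz_prd_corrti_st.s1mme_transstats_info",
    "pr_load_time",
    [20170701000317, 20170701000831],
)
print(stmt)
```

The printed statement can then be pasted into the hive shell or passed to whatever client you already use.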

2.

You can also drop a range of partitions by using a comparison operator in the partition spec.

Demo

hive> create table t (i int) partitioned by (p int);
OK

hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK

hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5

hive> alter table t drop if exists partition (p<=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK

hive> show partitions t;
OK
partition
p=4
p=5
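
Applied to the table from the question, the range form could be built like this. Again this is a sketch under assumptions: build_drop_range is a hypothetical helper, the cut-off value 20170701235959 is made up for illustration, and only <= appears in the demo above, so the other comparators in the whitelist are assumed to behave the same way.

```python
# Comparators allowed in the generated statement. Only "<=" is shown in the
# demo above; the others are an assumption.
COMPARATORS = ("<", "<=", ">", ">=")


def build_drop_range(table, column, op, bound):
    """Return an ALTER TABLE statement dropping all partitions matching `column op bound`.

    Hypothetical helper for illustration; it only builds the SQL text.
    """
    if op not in COMPARATORS:
        raise ValueError(f"unsupported comparator: {op!r}")
    # int() keeps the bound numeric so it can be embedded unquoted
    # (assumption: the partition column is bigint).
    return (
        f"ALTER TABLE {table} DROP IF EXISTS "
        f"PARTITION ({column}{op}{int(bound)})"
    )


range_stmt = build_drop_range(
    "cz_prd_corrti_st.s1mme_transstats_info",
    "pr_load_time",
    "<=",
    20170701235959,  # illustrative cut-off, not taken from the question
)
print(range_stmt)
```

Running SHOW PARTITIONS before executing the generated statement is a cheap way to check which partitions the comparison will match.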

David דודו Markovitz answered Sep 23 '22 08:09