1- I'm trying to delete multiple partitions at once, but struggling to do it with either Impala or Hive. I tried the following query, with and without '
:
ALTER TABLE cz_prd_corrti_st.s1mme_transstats_info DROP IF EXISTS
PARTITION (pr_load_time='20170701000317')
PARTITION (pr_load_time='20170701000831')
The error I'm getting is as follow:
AnalysisException: Syntax error in line 3: PARTITION (pr_load_time='20170701000831') ^ Encountered: PARTITION Expected: CACHED, LOCATION, PURGE, SET, UNCACHED CAUSED BY: Exception: Syntax error
The partition column is bigint type, query for deleting only one partition works as expected:
ALTER TABLE cz_prd_corrti_st.s1mme_transstats_info DROP IF EXISTS
PARTITION (pr_load_time='20170701000317')
2- Is it a good practice delete the hdfs data first and then drop the partitions in Impala/Hive, or is it supposed to be done vice versa?
Drop multiple partitions With the below alter script, we provide the exact partitions we would like to delete. hive> ALTER TABLE sales drop if exists partition (year = 2020, quarter = 1), partition (year = 2020, quarter = 2);
ALTER TABLE myTable DROP IF EXISTS PARTITION (date>='date1' and date<='date2'); ALTER TABLE myTable DROP IF EXISTS PARTITION (date>='date1' && date<='date2'); ALTER TABLE myTable DROP IF EXISTS PARTITION (date between 'date1' and 'date2'); hive.
To drop or alter multiple partitions: For example, you might drop a group of partitions corresponding to a particular date range after the data "ages out": alter table historical_data drop partition (year < 1995); alter table historical_data drop partition (year = 1996 and month between 1 and 6);
Partitioning is a way of dividing a table into related parts based on the values of particular columns like date, city, and department. Each table in the hive can have one or more partition keys to identify a particular partition. Using partition it is easy to do queries on slices of the data.
Your syntax is wrong.
In the DROP command the partitions should be separated by commas.
Demo
hive> create table t (i int) partitioned by (p int);
OK
hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK
hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5
hive> alter table t drop if exists partition (p=1),partition (p=2),partition(p=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK
hive> show partitions t;
OK
partition
p=4
p=5
You can drop a range.
Demo
hive> create table t (i int) partitioned by (p int);
OK
hive> alter table t add partition (p=1) partition(p=2) partition(p=3) partition(p=4) partition(p=5);
OK
hive> show partitions t;
OK
partition
p=1
p=2
p=3
p=4
p=5
hive> alter table t drop if exists partition (p<=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK
hive> show partitions t;
OK
partition
p=4
p=5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With