Hive version 1.1
I have a Hive external table as below:
CREATE EXTERNAL TABLE `schedule_events`(
`schedule_id` string COMMENT 'from deserializer',
`service_key` string COMMENT 'from deserializer',
`event_start_date_time` string COMMENT 'from deserializer',
`event_id` string COMMENT 'from deserializer',
`event_type` string COMMENT 'from deserializer',
`transitional_key` string COMMENT 'from deserializer',
`created_date_time` string COMMENT 'from deserializer',
`bus_date` string COMMENT 'from deserializer')
PARTITIONED BY (
`year` string,
`month` string,
`day` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'hdfs://nameservice1/hadoop/raw/omega/scheduled_events'
TBLPROPERTIES (
'avro.schema.url'='hdfs:////hadoop/raw/omega/schema/schedule_events.avsc',
'transient_lastDdlTime'='1505742141')
Now, to drop a particular partition, I can run an ALTER command as below:
ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year='2016',month='06',day='01')
Dropped the partition year=2016/month=06/day=01
hive> show partitions schedule_events;
OK
year=2017/month=09/day=01
year=2017/month=09/day=02
year=2017/month=09/day=03
year=2017/month=09/day=04
year=2017/month=09/day=05
But this table has many partitions.
How do I drop all existing partitions at once? Is that possible?
In Oracle Database (not Hive), you can remove multiple partitions or subpartitions from a range- or list-partitioned table with the DROP PARTITION and DROP SUBPARTITION clauses of the SQL ALTER TABLE statement.
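In Hive, a partition is dropped with the ALTER TABLE tablename DROP PARTITION command. For a managed table, dropping a partition removes the data from HDFS and the partition metadata from the Hive Metastore; for an external table, only the metastore entry is removed and the files stay in place. Dropping a partition that doesn't exist can return an error unless IF EXISTS is specified.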
The MSCK REPAIR TABLE command scans a file system such as Amazon S3 for Hive-compatible partitions that were added to the file system after the table was created. MSCK REPAIR TABLE compares the partitions in the table metadata with the partitions in S3.
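Because schedule_events in the question is declared EXTERNAL, dropping its partitions should only remove metastore entries, and the directories under the table location stay on HDFS. The following is a minimal sketch, assuming the files are laid out as year=.../month=.../day=... directories under the table location, of how one might check the table type and re-register partitions that were dropped by mistake:

```sql
-- Look for "Table Type:" in the output (MANAGED_TABLE vs EXTERNAL_TABLE).
DESCRIBE FORMATTED schedule_events;

-- Re-add metastore entries for partition directories that still exist
-- under the table location (assumes the year=/month=/day= layout).
MSCK REPAIR TABLE schedule_events;

-- Confirm which partitions are registered again.
SHOW PARTITIONS schedule_events;
```

If the goal is also to reclaim the storage, the partition directories would have to be removed from HDFS separately, since Hive leaves them alone for external tables.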
There are multiple options; here is one:
alter table schedule_events drop if exists partition (year<>'');
Hive: Extend ALTER TABLE DROP PARTITION syntax to use all comparators
"... To drop a partition from a Hive table, this works:
ALTER TABLE foo DROP PARTITION(ds = 'date')
...but it should also work to drop all partitions prior to date.
ALTER TABLE foo DROP PARTITION(ds < 'date')
This task is to implement ALTER TABLE DROP PARTITION for all of the comparators, < > <= >= <> = != instead of just for ="
https://issues.apache.org/jira/browse/HIVE-2908
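The comparator syntax described in that issue can also target a subset of partitions rather than all of them. Here is a rough sketch against the schedule_events table from the question. The year and month values are illustrative assumptions, and because the partition columns are strings, the comparisons should be expected to behave lexicographically:

```sql
-- Drop every partition of a single year (partial partition spec).
ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year = '2016');

-- Drop everything before 2017. With zero-padded string values,
-- lexicographic order matches chronological order.
ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year < '2017');

-- Conditions inside one PARTITION (...) spec are combined, so this
-- should drop only the 2017 partitions with a month before '09'.
ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year = '2017', month < '09');

-- Check what is left.
SHOW PARTITIONS schedule_events;
```

It is worth running SHOW PARTITIONS before and after each statement on a test table first, since a comparator that matches more than intended drops those partitions without further confirmation.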