
Hive dynamic partitioning

I'm trying to populate a partitioned table using dynamic partitioning, but I'm running into a problem. I'm running Hive 0.12 on Hortonworks Sandbox 2.0.

set hive.exec.dynamic.partition=true;
INSERT OVERWRITE TABLE demo_tab PARTITION (land)
SELECT stadt, geograph_breite, id, t.country
FROM demo_stg t;

However, it does not work; I'm getting an error.

Here is the query that creates the table demo_stg:

create table demo_stg
(
    country STRING,
    stadt STRING,
    geograph_breite FLOAT,
    id INT
    )
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\073";

And demo_tab:

CREATE TABLE demo_tab 
(
    stadt STRING,
    geograph_breite FLOAT,
    id INT
)
PARTITIONED BY (land STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\073";
  • The table demo_stg is also filled with data, so it's not empty.

Thanks for help :)

asked Jun 16 '14 07:06 by Baeumla



2 Answers

The partition column needs to be the last column in the SELECT query.

One more thing: besides setting hive.exec.dynamic.partition to true, you also need to set the dynamic partition mode to nonstrict:

set hive.exec.dynamic.partition.mode=nonstrict;
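
Putting both points together, a minimal sketch of the full statement might look like the following. It assumes the two tables from the question and that country is the column meant to populate land. Hive matches dynamic partition columns by position rather than by name, so the alias is there only for readability.

```sql
-- Enable dynamic partitioning and allow every partition column to be dynamic.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Assumption: demo_stg.country supplies the value for the partition column land.
-- The partition value must be the last expression in the SELECT list,
-- after the three regular columns of demo_tab in their declared order.
INSERT OVERWRITE TABLE demo_tab PARTITION (land)
SELECT t.stadt, t.geograph_breite, t.id, t.country AS land
FROM demo_stg t;
```

In strict mode Hive expects at least one static partition value, which is why a statement with only the dynamic partition land is rejected until the mode is changed.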
answered Oct 27 '22 16:10 by Azam Khan


You need to modify the SELECT in your statement:

set hive.exec.dynamic.partition=true;
INSERT OVERWRITE TABLE demo_tab PARTITION (land)
SELECT stadt, geograph_breite, id, t.country
FROM demo_stg t;

I am not sure which column of your staging table you want to partition on, or which one corresponds to land. Whichever it is, it must be the last column in the SELECT, right after the three regular columns of demo_tab, so that the SELECT returns exactly four columns. For example, if you wanted to partition on id, the statement would be:

INSERT OVERWRITE TABLE demo_tab PARTITION (land)
SELECT stadt, geograph_breite, id, t.id AS land
FROM demo_stg t;

I think this should work.
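
To check the result after running the INSERT, one option is to list the partitions Hive created and then read one of them back. This is only a sketch: the value 'Deutschland' is a hypothetical partition value and is not taken from the question's data.

```sql
-- List the partitions that the dynamic insert created, one per distinct value.
SHOW PARTITIONS demo_tab;

-- Read a few rows from a single partition.
-- 'Deutschland' is a made-up example value; substitute one shown by SHOW PARTITIONS.
SELECT stadt, geograph_breite, id
FROM demo_tab
WHERE land = 'Deutschland'
LIMIT 10;
```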

answered Oct 27 '22 16:10 by Tanveer