
Creation of a partitioned external table with hive: no data available

I have the following file on HDFS: [screenshot]

I create the structure of the external table in Hive:

CREATE EXTERNAL TABLE google_analytics(
  `session` INT)
PARTITIONED BY (date_string string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/flumania/google_analytics';

ALTER TABLE google_analytics ADD PARTITION (date_string = '2016-09-06') LOCATION '/flumania/google_analytics';

After that, the table structure is created in Hive, but I cannot see any data: [screenshot]

Since it's an external table, data insertion should be done automatically, right?

asked Sep 02 '16 16:09 by rom




1 Answer

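Hive maps the delimited fields of each line to the table's columns by position, and the table's first column, session, is an INT. The fields in your file should therefore be in this order:

int,string

In your file, the fields are currently in this order:

string,int

Change your file to the following:

86,"2016-08-20"
78,"2016-08-21"

It should work.
Also, it is not recommended to use keywords as column names (date).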
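
As a rough way to check the result, the following HiveQL sketch lists the registered partitions and reads a few rows back. It assumes the table and the partition from the question exist exactly as written there and that the file has already been rewritten with the integer first; the LIMIT value is arbitrary.

```sql
-- List the partitions Hive has registered for this table.
SHOW PARTITIONS google_analytics;

-- Read back a few rows from the partition added in the question.
-- date_string comes from the partition definition, not from the file.
SELECT `session`, date_string
FROM google_analytics
WHERE date_string = '2016-09-06'
LIMIT 10;
```

If session still comes back as NULL, the first field of each line is probably not parseable as an INT.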

answered Oct 14 '22 05:10 by dileepvarma