Why do we need to move data from an external table to a managed Hive table?

Question

I am new to Hadoop and learning Hive.

In Hadoop: The Definitive Guide, 3rd edition, page 428, last paragraph,

I don't understand the paragraph below about external tables in Hive.

"A common pattern is to use an external table to access an initial dataset stored in HDFS (created by another process), then use a Hive transform to move the data into a managed Hive table."

Can anybody briefly explain what the above passage means?

dimamah · Accepted Answer

Usually the data in the initial dataset is not laid out in the optimal way for queries.
You may want to modify the data (for example, changing some columns, adding columns, or computing aggregations) and store it in a specific layout (partitioned, bucketed, sorted, etc.) so that queries benefit from these optimizations.
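
As a rough illustration of that pattern, here is a minimal HiveQL sketch. The table names (`raw_logs`, `page_views`), the columns, the HDFS path, and the timestamp format are all hypothetical, and ORC storage assumes Hive 0.11 or later; adjust them to your own data.

```sql
-- Hypothetical external table: Hive only points at files that another
-- process already wrote to HDFS. Dropping it leaves the files in place.
CREATE EXTERNAL TABLE raw_logs (
  ts      STRING,   -- assumed format: 'YYYY-MM-DD HH:MM:SS'
  user_id STRING,
  url     STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/raw_logs';

-- Hypothetical managed table: Hive owns the data, and the layout is
-- chosen for queries (partitioned by day, columnar storage).
CREATE TABLE page_views (
  user_id STRING,
  url     STRING,
  views   BIGINT
)
PARTITIONED BY (dt STRING)
STORED AS ORC;

-- Allow Hive to create the dt partitions from the query result.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The "transform" step: aggregate the raw rows and write them into
-- the managed table. The partition column must come last in the SELECT.
INSERT OVERWRITE TABLE page_views PARTITION (dt)
SELECT user_id,
       url,
       COUNT(*)          AS views,
       substr(ts, 1, 10) AS dt
FROM raw_logs
GROUP BY user_id, url, substr(ts, 1, 10);
```

Queries that filter on `dt` can then read only the matching partitions of `page_views` instead of scanning all of the raw files.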

Tags:

hadoop

hive

external-tables

Raj
