Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does the column order matter in INSERT OVERWRITE statement in Hive?

Tags:

hive

Lets say I am using INSERT OVERWRITE TABLE command with partition:-

INSERT OVERWRITE TABLE target PARTITION (date_id = ${hiveconf:DateId})
SELECT a as columnA,
       b as columnB,
       c as columnC from sourcetable;

And lets say the order of columns in the target table is not the same as specified in the insert overwrite/select. Does it matter? Because I am specifying column names explicitly using AS clause ideally it should not matter. But I am getting some data which is getting swapped on the target side. I am strongly suspecting that probably column order matters. So just wanted to confirm.

like image 375
Dhiraj Avatar asked Apr 06 '17 08:04

Dhiraj


1 Answers

The order of the columns is the only thing that matters.
There should be an exact fit between the source and the target.

P.s.
I'm not aware of any SQL provider that support matching between target and source by columns' names.
The ISO SQL support a list of columns for the target table, e.g.:

insert into trg (col4, col1, col2)
select ... , ... , ... from ...

This is currently not supported by Hive.

like image 141
David דודו Markovitz Avatar answered Oct 07 '22 00:10

David דודו Markovitz