
hive - is it possible to create a column from another column

Tags:

sql

hive

hiveql

I was wondering if it is possible to create a new column from an existing column in Hive.

Let's say I have a table People (name string, age int) and I want to add a column is_old string, defined as if(age > 70, 'old', 'not_old'). Is there a way to do this?

The only way I can think of currently is to create a new table People_new from the old one, drop the old table, and rename the new table, like so:

create table People_new as select name, age, if(age > 70, 'old', 'not_old') as is_old from People;
drop table People;
alter table People_new rename to People;

Is there a way of doing this without creating a temp table?
(For instance, Oracle has the idea of calculated columns.)

asked Jun 29 '16 03:06 by anthonybell



2 Answers

Yes, there is a way to do it without creating a temp table. Add the column, then run an insert overwrite that selects from the table itself:

ALTER TABLE People ADD COLUMNS (is_old string);

INSERT OVERWRITE TABLE People 
SELECT name, age, if(age > 70, 'old', 'not_old') as is_old
  FROM People ; 
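
To confirm the rewrite did what was intended, a couple of sanity queries can help. This is only a sketch, assuming the non-partitioned People table from the question; the alias n is illustrative and not part of the original answer.

```sql
-- Between ADD COLUMNS and INSERT OVERWRITE, existing rows read NULL for is_old.
-- After the INSERT OVERWRITE, every row should carry 'old' or 'not_old'.
SELECT is_old, COUNT(*) AS n
  FROM People
 GROUP BY is_old;

-- Spot-check a few rows against the rule age > 70.
SELECT name, age, is_old
  FROM People
 LIMIT 10;
```

Rows whose age is NULL end up as 'not_old', because the condition age > 70 is not true for them.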
answered Oct 05 '22 05:10 by leftjoin


You are almost there; you should just try it.

Use CASE WHEN:

,CASE WHEN age > 70 THEN 'old' ELSE 'not_old' END AS is_old

Or use IF:

,IF(age > 70, 'old', 'not_old') AS is_old
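
Both fragments are select-list entries, so they need a surrounding statement to run. A minimal sketch, assuming the People (name string, age int) table from the question, might look like this:

```sql
-- Compute is_old on the fly with CASE WHEN; the table itself is not changed.
SELECT name,
       age,
       CASE WHEN age > 70 THEN 'old' ELSE 'not_old' END AS is_old
  FROM People;

-- The same result using IF.
SELECT name,
       age,
       IF(age > 70, 'old', 'not_old') AS is_old
  FROM People;
```

Note that the alias takes no type here: a type such as string belongs in a column definition (CREATE TABLE or ADD COLUMNS), not after AS in a select list.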
answered Oct 05 '22 05:10 by Blank