
How to specify column list in hive insert into query

I have just installed and configured Apache Hive version 1.1.0. Then I created a table by running this query:

create table person (name1 string, surname1 string);

Then I tried to add one row with:

insert into person (name1, surname1) values ("Alan", "Green");

This causes an error:

Error: Error while compiling statement: FAILED: ParseException line 1:20 cannot recognize input near '(' 'name1' ',' in statement (state=42000,code=40000).

But when I execute the query without a column list, it works fine:

insert into person values ("Alan", "Green");

The question is: how do I specify a column list in an INSERT INTO statement in HiveQL?

Vasli Slavik asked Mar 17 '15 09:03



2 Answers

According to HIVE-9481, you can specify a column list in an INSERT statement since Hive 1.2.0. The syntax is:

INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
[(column_list)]
select_statement1 FROM from_statement;

Example:

CREATE TABLE pageviews (userid VARCHAR(64), link STRING, `from` STRING)
PARTITIONED BY (datestamp STRING)
CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC;

INSERT INTO TABLE pageviews
PARTITION (datestamp = '2014-09-23')
(userid, link)
VALUES ('jsmith', 'mail.com');

I tested this with Hive 2.1. It works only with INSERT INTO, not with INSERT OVERWRITE.

I don't know why this syntax is not mentioned on the Apache wiki page LanguageManual DML.

https://issues.apache.org/jira/browse/HIVE-9481
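
The same column list can also precede a SELECT instead of VALUES. The following is only a sketch: raw_views is a hypothetical staging table that does not appear in the question, and the NULL-filling of unlisted columns is how I read the HIVE-9481 description, not something verified here.

```sql
-- Assumes Hive 1.2.0 or later and the pageviews table created above.
-- raw_views is a hypothetical staging table with columns (uid, url).
INSERT INTO TABLE pageviews
PARTITION (datestamp = '2014-09-23')
(userid, link)
SELECT uid, url
FROM raw_views;
-- Columns of pageviews that are not in the list (here, the third one)
-- are expected to be written as NULL for every inserted row.
```

The selected expressions are matched to the column list by position, so their order has to follow (userid, link) rather than the table definition.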

yetsun answered Oct 12 '22 17:10


Inserting into specific columns, as in the query from the question:

insert into table person (name1, surname1) values ("Alan", "Green");

is supported in Hive 2.0.
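
On a version without column-list support, such as the 1.1.0 from the question, a possible workaround is to supply a value for every column in table order. This is a minimal sketch built on the person table from the question; the second statement, which uses NULL for a column you want to leave empty, is an assumption and was not run on 1.1.0.

```sql
-- No column list: values are matched to person's columns by position
-- (name1 first, surname1 second). The question reports this form works.
INSERT INTO TABLE person VALUES ('Alan', 'Green');

-- Assumed workaround for "skipping" a column: pass NULL in its position.
INSERT INTO TABLE person VALUES ('Alan', NULL);
```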

Aditya answered Oct 12 '22 15:10