I have just installed and configured Apache Hive 1.1.0. I then created a table with this query:
create table person (name1 string, surname1 string);
Then I tried to add one row with:
insert into person (name1, surname1) values ("Alan", "Green");
This causes an error:
Error: Error while compiling statement: FAILED: ParseException line 1:20 cannot recognize input near '(' 'name1' ',' in statement (state=42000,code=40000).
But when I execute the query without a column list, it works fine:
insert into person values ("Alan", "Green");
The question is: how do I specify a column list in an INSERT INTO statement in HiveQL?
INSERT INTO table using a SELECT clause. This is one of the most widely used methods for inserting data into a Hive table: the SELECT clause is combined with the INSERT INTO command to insert rows selected from another table. The general form is INSERT INTO TABLE target_table SELECT ... FROM source_table;
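To make the INSERT ... SELECT form concrete, here is a minimal sketch using the person table from the question. The staging table person_staging is a hypothetical name I made up for illustration; it is not part of the original question.

```sql
-- Hypothetical source table with the same column types as person.
CREATE TABLE person_staging (name1 STRING, surname1 STRING);

-- Copy every row from the staging table into person.
-- Without a column list, values are matched by position,
-- so the SELECT list must follow the target table's column order.
INSERT INTO TABLE person
SELECT name1, surname1
FROM person_staging;
```

Because no column list is involved, this form should also parse on versions such as 1.1.0 where the column list is rejected.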
The easiest way to select specific columns in a Hive query is to name them in the SELECT statement: SELECT col1, col3, col4 .... FROM Table1; But imagine your table contains many columns (e.g., more than 100) and you need to exclude only a few of them in the SELECT statement.
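For the "exclude a few columns" case, one option is Hive's regex column specification. This is only a sketch: the column names col2 and col5 are placeholders, and on Hive 0.13.0 and later it assumes hive.support.quoted.identifiers has been set to none for the session.

```sql
-- Assumption: Hive 0.13.0+ needs this setting so that a backquoted
-- string is treated as a regex rather than a literal column name.
SET hive.support.quoted.identifiers=none;

-- Select every column of Table1 except col2 and col5 (placeholder names).
SELECT `(col2|col5)?+.+` FROM Table1;
```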
In Hive versions before 0.14 you can't use INSERT INTO to insert a single record; it is not supported. Instead, you can place all the new records you want to insert in a file and load that file into a temp table in Hive. Then use an INSERT OVERWRITE ... SELECT command to insert those rows into a new partition of your main Hive table.
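The file-based approach described above could look roughly like this. The table names person_tmp and person_by_day, the partition column load_date, and the file path are all hypothetical, chosen only to illustrate the steps.

```sql
-- Hypothetical partitioned "main" table.
CREATE TABLE person_by_day (name1 STRING, surname1 STRING)
PARTITIONED BY (load_date STRING);

-- Hypothetical temp table matching a comma-separated input file.
CREATE TABLE person_tmp (name1 STRING, surname1 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Load the file with the new records (path is a placeholder).
LOAD DATA LOCAL INPATH '/tmp/new_people.csv' INTO TABLE person_tmp;

-- Write those rows into a new partition of the main table.
INSERT OVERWRITE TABLE person_by_day PARTITION (load_date = '2015-06-01')
SELECT name1, surname1
FROM person_tmp;
```

Note that INSERT OVERWRITE replaces whatever is already in the target partition, which is why the advice is to write into a new partition.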
INSERT ... VALUES, UPDATE, and DELETE SQL statements are supported in Apache Hive 0.14 and later; MERGE was added later, in Hive 2.2. The INSERT ... VALUES statement enables users to write data to Apache Hive from values provided in SQL statements. The UPDATE and DELETE statements enable users to modify and delete values already written to Hive.
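As a rough illustration of these statements: UPDATE and DELETE only work on transactional (ACID) tables, so the sketch below uses a separate, hypothetical table person_acid rather than the plain person table from the question. The two SET lines are the session settings I believe are needed; your installation may require additional transaction configuration on the server side.

```sql
-- Assumed session settings for ACID operations.
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Hypothetical transactional table: bucketed, stored as ORC.
CREATE TABLE person_acid (name1 STRING, surname1 STRING)
CLUSTERED BY (name1) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- Write a row from literal values.
INSERT INTO TABLE person_acid VALUES ('Alan', 'Green');

-- Modify a non-bucketing column of the existing row.
UPDATE person_acid SET surname1 = 'Brown' WHERE name1 = 'Alan';

-- Remove the row again.
DELETE FROM person_acid WHERE name1 = 'Alan';
```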
According to HIVE-9481, you can specify a column list in an INSERT statement since Hive 1.2.0. The syntax is like this:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)
[(column_list)]
[IF NOT EXISTS]] select_statement1 FROM from_statement;
Example:
CREATE TABLE pageviews (userid VARCHAR(64), link STRING, `from` STRING)
PARTITIONED BY (datestamp STRING)
CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC;
INSERT INTO TABLE pageviews
PARTITION (datestamp = '2014-09-23')
(userid,link)
VALUES ('jsmith', 'mail.com');
I tested this with Hive 2.1. It works only with INSERT INTO, not with INSERT OVERWRITE.
I don't know why this syntax is not mentioned on the Apache wiki page LanguageManual DML.
https://issues.apache.org/jira/browse/HIVE-9481
Inserting into specific columns, as in the query above:
insert into table person (name1, surname1) values ("Alan", "Green");
is supported in Hive 2.0.
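On a version that accepts the column list, the list does not have to follow the table's column order or name every column. The sketch below assumes Hive 1.2.0 or later and reflects my reading of HIVE-9481, namely that columns left out of the list are filled with NULL. I have not verified it on every release, and the second row ('Bob') is made up for illustration.

```sql
-- Columns listed in a different order than in the table definition.
INSERT INTO TABLE person (surname1, name1) VALUES ('Green', 'Alan');

-- Only one column listed; surname1 is expected to be NULL for this row.
INSERT INTO TABLE person (name1) VALUES ('Bob');

-- Inspect the result.
SELECT name1, surname1 FROM person;
```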