I am new to Hive. I have successfully set up a single-node Hadoop cluster for development purposes, and on top of it I have installed Hive and Pig.
I created a dummy table in Hive:
create table foo (id int, name string);
Now I want to insert data into this table. Can I add data one record at a time, as in SQL? Kindly help me with a command analogous to:
insert into foo (id, name) VALUES (12, "xyz");
Also, I have a csv file which contains data in the format:
1,name1
2,name2
..
..
..
1000,name1000
How can I load this data into the dummy table?
Hive provides multiple ways to add data to tables. You can use DML (Data Manipulation Language) statements such as LOAD DATA and INSERT to import or add data to a table. You can also place data files directly into the table's directory with HDFS commands.
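As a rough sketch of the DML route for the CSV file in the question: the table needs a comma delimiter declared, because a table created without a ROW FORMAT clause uses Hive's default field separator (Ctrl-A) and a comma-separated file would not be split into columns. The table name foo_csv and the file paths below are made up for illustration.

```sql
-- Hypothetical table with the same columns as foo, but comma-delimited
CREATE TABLE foo_csv (id INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load from the local filesystem (the file is copied into the table's directory).
-- '/tmp/names.csv' is an assumed path.
LOAD DATA LOCAL INPATH '/tmp/names.csv' INTO TABLE foo_csv;

-- If the file is already in HDFS, drop LOCAL (the file is moved, not copied).
-- '/user/hive/staging/names.csv' is an assumed path.
-- LOAD DATA INPATH '/user/hive/staging/names.csv' INTO TABLE foo_csv;

-- Quick check that the columns were parsed
SELECT * FROM foo_csv LIMIT 5;
```

LOAD DATA appends to the table by default; adding OVERWRITE before INTO TABLE replaces the existing contents instead.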
Older versions of Hive (before 0.14) do not support INSERT INTO ... VALUES, so you can't insert a single record directly. Instead, place all the new records you want to insert in a file and load that file into a temporary table in Hive. Then use an insert overwrite ... select statement to insert those rows into a new partition of your main Hive table.
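The staging-table approach could look roughly like this. It is only a sketch: the table names foo_main and foo_staging, the partition column load_date, the date value, and the file path are all placeholders, not anything from the question.

```sql
-- Hypothetical partitioned main table
CREATE TABLE foo_main (id INT, name STRING)
PARTITIONED BY (load_date STRING);

-- Hypothetical staging table matching the layout of the file of new records
CREATE TABLE foo_staging (id INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- Replace the staging contents with the new records (assumed local path)
LOAD DATA LOCAL INPATH '/tmp/new_records.csv' OVERWRITE INTO TABLE foo_staging;

-- Write the staged rows into one partition of the main table.
-- The partition value is a placeholder.
INSERT OVERWRITE TABLE foo_main PARTITION (load_date = '2015-01-01')
SELECT id, name FROM foo_staging;
```

Because each batch goes into its own partition, INSERT OVERWRITE replaces only that partition and leaves earlier loads untouched.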
To insert records into a table (supported in Hive 0.14 and later), write the keywords insert into followed by the table name, then a comma-separated list of column names in parentheses, then the keyword values, then the list of values enclosed in parentheses.
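For the foo table from the question, that would look something like the following. This assumes Hive 0.14 or later. The form with an explicit column list is, as far as I know, only accepted from Hive 1.2 onward, so it is left commented out.

```sql
-- Insert a single row (values are matched to columns by position)
INSERT INTO TABLE foo VALUES (12, 'xyz');

-- Several rows can go in one statement
INSERT INTO TABLE foo VALUES (13, 'abc'), (14, 'def');

-- With an explicit column list (assumed to need Hive 1.2 or later)
-- INSERT INTO TABLE foo (id, name) VALUES (15, 'ghi');
```

Each such statement typically runs as its own job and writes a new small file, so for the 1000-row CSV a bulk load is the better fit.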
I think the best way is:
a) Copy data into HDFS (if it is not already there)
b) Create an external table over your CSV like this (LOCATION must be the HDFS directory that contains the file, not the file itself):
CREATE EXTERNAL TABLE TableName (id int, name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'place in HDFS';
c) You can start querying TableName right away.
d) If you want to insert the data into another Hive table:
insert overwrite table finalTable select * from TableName;