Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Sqoop to import data from MySQL to Hive

I am using Sqoop (version 1.4.4) to import data from MySQL to Hive. The data will be a subset of one of tables, i.e. few columns from a table. Is it necessary to create table in Hive before hand. Or importing the data will create the name specified in the command if it is not in the Hive?

like image 686
Nayan Avatar asked Mar 14 '14 12:03

Nayan


3 Answers

As mentioned in the sqoop documentation, you will not have to create any hive tables if you use the --hive-import argument in your command

example:

sqoop import \
--connect jdbc:mysql://mysql_server:3306/db_name \
--username mysql_user \
--password mysql_pass \
--table table_name \
--hive-import

Also... consider the --hive-overwrite argument if you want to schedule a full data import, on a daily base for example

like image 170
Manu Eidenberger Avatar answered Sep 23 '22 21:09

Manu Eidenberger


I finally resolved the issue. It would involve two steps.

  1. Create an external hive table.
  2. Import data using Sqoop.

Creation of External table : External tables in hive are kind of permanent tables and stays there even if hive is stopped or server goes down. "EXTERNAL" keyword is used to specify table type.

CREATE EXTERNAL TABLE IF NOT EXISTS HIVEDB.HIVE_TABLE1 (DATE_COL DATE, 
BIG_INT_COL BIGINT, INT_COL INT, VARCHAR_COL VARCHAR(221), FLOAT_COL FLOAT);

Import the data using Sqoop : Specify the created table name while importing the data, instead of using "--hive-create" option.

sqoop import --connect jdbc:mysql://mysqlhost/mysqldb --username user --password 
passwd --query "SELECT table1.date_col, table1.big_int_col, table1.int_col, 
table1.varchar_col, table1.float_col FROM MYSQL_TABLE1 AS table1 WHERE 
\$CONDITIONS" --split-by table1.date_col --hive-import 
--hive-table hivedb.hive_table1 --target-dir hive_table1_data`

Data was stored permanently in Hive.

like image 44
Nayan Avatar answered Sep 23 '22 21:09

Nayan


Even if there is no table in hive, sqoop import will create it. The following worked for me :

sqoop import \
--connect jdbc:mysql://localhost/<<dbname>> \
--username <<YourMySqlUsername>> \
--password <<YourMySqlpwd>> \
--table employee \
--hive-import \
--hive-table employee_1 \
-m -1
like image 34
Subhashini Avatar answered Sep 22 '22 21:09

Subhashini