
Difference between --warehouse-dir and --target-dir commands in sqoop

Tags:

sqoop

I don't understand the difference between the following options in Sqoop. It would help if someone could explain them with small examples.

 --warehouse-dir and --target-dir 

Thanks

sree asked May 24 '16 14:05



People also ask

What are the 2 main functions of Sqoop?

Sqoop has two main functions: importing and exporting. Importing transfers structured data into HDFS; exporting moves this data from Hadoop to external databases in the cloud or on-premises. Importing involves Sqoop assessing the external database's metadata before mapping it to Hadoop.

What will happen if Target dir already exists during Sqoop import?

By default, imports go to a new target location. If the destination directory already exists in HDFS, Sqoop will refuse to import rather than overwrite that directory's contents.

What is the Warehouse directory in Sqoop import?

--warehouse-dir names the parent directory under which each imported table is stored in a subdirectory named after the table. With --target-dir, if you import table by table, you need to provide a distinct target directory each time, because two imports cannot share the same target directory.

What is Sqoop target directory?

We can specify the target directory while importing table data into HDFS using the Sqoop import tool, by passing the --target-dir option to the sqoop import command, for example to import the emp_add table data into the '/queryresult' directory.


2 Answers

As I understand it, for imports:

--warehouse-dir: It creates a directory that acts as the database directory (sqoop_db_movies). A subdirectory named after the table (as given in the import command) is created automatically inside it and holds the imported files.

Example: sqoop import --options-file /home/cloudera/sqoop/conn --table movies --warehouse-dir /sqoop_db_movies -m 1

Output:

/sqoop_db_movies/movies

/sqoop_db_movies/movies/_SUCCESS

/sqoop_db_movies/movies/part-m-00000

--target-dir: It creates a directory that acts as the table directory (sqoop_table_movies) and holds the imported files directly, with no extra subdirectory for the table name.

Example: sqoop import --options-file /home/cloudera/sqoop/conn --table movies --target-dir /sqoop_table_movies -m 1

Output:

/sqoop_table_movies/_SUCCESS

/sqoop_table_movies/part-m-00000
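
To make the path rule from the two examples above explicit, here is a small shell sketch. The function import_dir is a hypothetical helper written only for illustration (it is not part of Sqoop and does not call it); it just prints the HDFS directory where the part files would be expected to land for each option, assuming the behaviour shown in the outputs above.

```shell
#!/bin/sh
# Hypothetical helper, NOT a Sqoop command: prints the HDFS directory in which
# the imported files (_SUCCESS, part-m-*) are expected to appear.
#   $1 = which option was used: "warehouse" (--warehouse-dir) or "target" (--target-dir)
#   $2 = the directory passed to that option
#   $3 = the table name passed to --table
import_dir() {
    case "$1" in
        # --warehouse-dir: the directory is a parent; the table gets its own subdirectory
        warehouse) printf '%s/%s\n' "$2" "$3" ;;
        # --target-dir: the files are written directly into the given directory
        target)    printf '%s\n' "$2" ;;
        *)         echo "unknown option: $1" >&2; return 1 ;;
    esac
}

import_dir warehouse /sqoop_db_movies movies     # prints /sqoop_db_movies/movies
import_dir target    /sqoop_table_movies movies  # prints /sqoop_table_movies
```

After a real import, listing the printed directory with hdfs dfs -ls should show the _SUCCESS and part files. Note that the Sqoop documentation describes the two options as incompatible with each other, so a single import uses one or the other.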

Santosh Singh answered Oct 04 '22 03:10



The parameter below can point to the default Hive table location, which serves as the parent directory for the imported tables. It is handy for development, where you just want to run some tests on internal tables.

--warehouse-dir

The parameter below points to an HDFS location of your choice, over which you can create external Hive tables. This is useful in a production environment, where you want all data to live in an external directory backed by an external table.

--target-dir

sumitya answered Oct 04 '22 02:10
