Sqoop import without primary key in RDBMS

Question

Can I import RDBMS table data (table doesn't have a primary key) to hive using sqoop? If yes, then can you please give the sqoop import command.

I have tried with sqoop import general command, but it failed.

Prasad Khode · Accepted Answer

If your table has no primary key defined then you have to give -m 1 option for importing the data or you have to provide --split-by argument with some column name, otherwise it gives the error:

ERROR tool.ImportTool: Error during import: No primary key could be found for table <table_name>. Please specify one with --split-by or perform a sequential import with '-m 1'

then your sqoop command will look like

sqoop import \
    --connect jdbc:mysql://localhost/test_db \
    --username root \
    --password **** \
    --table user \
    --target-dir /user/root/user_data \
    --columns "first_name, last_name, created_date"
    -m 1

or

sqoop import \
    --connect jdbc:mysql://localhost/test_db \
    --username root \
    --password **** \
    --table user \
    --target-dir /user/root/user_data \
    --columns "first_name, last_name, created_date"
    --split-by created_date

Tony79 · Answer

Quick view:

The Sqoop job fails and the error looks like this" Error during import: No primary key could be found for the table . Please specify one with --split-by or perform a sequential import with '-m 1' "

Description: Usually, when you perform a Sqoop job internally it searches for the primary key in the table. If there is no primary key the Sqoop job fails and the error looks like this" Error during import: No primary key could be found for the table . Please specify one with --split-by or perform a sequential import with '-m 1' ". The suggestion describes there are two alternative approaches to this scenario.

Best way is option 2

To specify the number of mappers as 1 (default it takes 4). So by specifying the number of mappers to 1, the task will be sequential and identical to a single threaded task. This will succeed only when you are targeting a small table if in case if you are looking for a large import this will fail as the task tends to run forever.
The best approach is to use split-by where you can specify the number of mappers on the bases of indexed columns or splitting column manually( with queries ).

Sqoop import without primary key in RDBMS

Tags:

import

sqoop

KM Prak

2 Answers

Prasad Khode

The Sqoop job fails and the error looks like this" Error during import: No primary key could be found for the table . Please specify one with --split-by or perform a sequential import with '-m 1' "

Tony79

Recent Activity

Donate For Us

Sqoop import without primary key in RDBMS

Tags:

import

sqoop

KM Prak

2 Answers

Prasad Khode

The Sqoop job fails and the error looks like this" Error during import: No primary key could be found for the table . Please specify one with --split-by or perform a sequential import with '-m 1' "

Tony79

Related questions

Recent Activity

Donate For Us