I am attempting to run a Sqoop job that loads data from an Oracle database into a Hadoop cluster in Parquet format. The job is incremental.
Sqoop version is 1.4.6. Oracle version is 12c. Hadoop version is 2.6.0 (distro is Cloudera 5.5.1).
The Sqoop commands are as follows (the first creates the job, the second executes it):
$ sqoop job -fs hdfs://<HADOOPNAMENODE>:8020 \
--create myJob \
-- import \
--connect jdbc:oracle:thin:@<DBHOST>:<DBPORT>/<DBNAME> \
--username <USERNAME> \
-P \
--as-parquetfile \
--table <USERNAME>.<TABLENAME> \
--target-dir <HDFSPATH> \
--incremental append \
--check-column <TABLEPRIMARYKEY>
$ sqoop job --exec myJob
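For reference, the stored job definition can be inspected before running --exec (just a quick sanity check, not part of the failing run):
$ sqoop job --list          # the saved job "myJob" should be listed
$ sqoop job --show myJob    # prints the stored definition and parameters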
Error on execute:
16/02/05 11:25:30 ERROR sqoop.Sqoop: Got exception running Sqoop:
org.kitesdk.data.ValidationException: Dataset name
05112528000000918_2088_<USERNAME>.<TABLENAME>
is not alphanumeric (plus '_')
at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:103)
at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:66)
at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
at org.kitesdk.data.Datasets.create(Datasets.java:239)
at org.kitesdk.data.Datasets.create(Datasets.java:307)
at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:80)
at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:106)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:668)
at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:444)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Troubleshooting Steps:
0) HDFS is stable, other Sqoop jobs are functional, the Oracle source DB is up, and the connection has been tested.
1) I tried creating a synonym in Oracle so that I could pass the --table option without the schema prefix:
--table TABLENAME (without the username)
This failed with an error that the table name was not correct; the --table option needs the full USERNAME.TABLENAME.
Error:
16/02/05 12:04:46 ERROR tool.ImportTool: Imported Failed: There is no column found in the target table <TABLENAME>. Please ensure that your table name is correct.
2) I confirmed that this is a Parquet-specific issue: removing the --as-parquetfile option made the job succeed (a sketch of this variant and the next one follows step 4).
3) I wondered whether this is somehow caused by the incremental options. I removed the --incremental append and --check-column options and the job was also successful, which confuses me.
4) I tried the same job against a MySQL source and it was successful.
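To make steps 2 and 3 concrete, this is roughly what the successful variant from step 2 looked like. The job name myJob_noparquet is just an illustrative label; the placeholders are the same as in the original command:
# Step 2 variant: identical to the original job, minus --as-parquetfile.
# This version ran to completion.
$ sqoop job -fs hdfs://<HADOOPNAMENODE>:8020 \
--create myJob_noparquet \
-- import \
--connect jdbc:oracle:thin:@<DBHOST>:<DBPORT>/<DBNAME> \
--username <USERNAME> \
-P \
--table <USERNAME>.<TABLENAME> \
--target-dir <HDFSPATH> \
--incremental append \
--check-column <TABLEPRIMARYKEY>
$ sqoop job --exec myJob_noparquet
# Step 3 variant: keep --as-parquetfile but drop --incremental append and
# --check-column; that combination also ran to completion.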
Has anyone run into something similar? Is there a way (or is it even advisable) to disable the Kite validation? It seems that the dataset name is being generated with a dot ("."), which the Kite SDK then rejects - but this is an assumption on my part, as I am not very familiar with the Kite SDK.
Thanks in advance,
Jose
Resolved. There seems to be a known issue with JDBC connectivity to Oracle 12c. Using the OJDBC6 driver (instead of OJDBC7) did the trick. FYI - the OJDBC jar is installed in /usr/share/java/ and a symbolic link is created in /installpath.../lib/sqoop/lib/
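For anyone hitting the same thing, the driver swap looked roughly like this. The Sqoop lib directory depends on your installation (abbreviated as /installpath... above), so <SQOOPLIBDIR> below is a placeholder:
$ sudo cp ojdbc6.jar /usr/share/java/
$ sudo ln -s /usr/share/java/ojdbc6.jar <SQOOPLIBDIR>/ojdbc6.jar
# If an ojdbc7 jar is still present in <SQOOPLIBDIR>, remove that link so Sqoop
# picks up the OJDBC6 driver instead.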