I wrote a simple script to create a user (TestV100), create a table (Xy100) in that schema and export a tabl delimited flat file from hadoop to this oracle table.
#!/bin/bash
# Testing connectivity to Oracle DB
#sqoop eval --connect jdbc:oracle:thin:@hostname:1521:orcl -username test -password password --query "SELECT count(*) as bob FROM \"TestV1\".\"Test\"" --verbose
HOST=$1
USER=$2
PASS=$3
SCHEMA=$4
PORT=$5
SID=$6
SQOOP=/usr/bin/sqoop
JDBC="jdbc:oracle:thin:@$1:$5:$6"
SQOOP_EVAL="$SQOOP eval --connect $JDBC --username $USER --password $PASS --query"
#Create Schema and Tables;
${SQOOP_EVAL} "CREATE USER \"TestV100\" identified by \"password\""
${SQOOP_EVAL} "GRANT CONNECT TO \"TestV100\""
${SQOOP_EVAL} "ALTER USER \"TestV100\" QUOTA UNLIMITED ON USERS"
${SQOOP_EVAL} "DROP TABLE \"TestV100\".\"Xy100\""
${SQOOP_EVAL} "CREATE TABLE \"TestV100\".\"Xy100\"( \"a\" NVARCHAR2(255) DEFAULT NULL, \"x\" NUMBER(10,0) DEFAULT NULL, \"y\" NUMBER(10,0) DEFAULT NULL )"
############################
## Load Data into tables; ##
############################
SQOOP_EXPORT="/usr/bin/sudo -u hdfs $SQOOP export --connect ${JDBC} --username $USER --password $PASS --export-dir"
${SQOOP_EXPORT} "/tmp/rv/TestV100/xy100.txt" --table "\"\"$SCHEMA\".\"Xy100\"\"" --fields-terminated-by "\t" --input-null-string null -m 1
c 8 3
a 1 4
c 6 1
c 2 0
a 7 7
c 4 2
c 7 5
a 0 0
c 5 6
a 2 2
a 5 5
a 3 6
c 9 7
a 4 1
c 3 4
a 6 3
b 6 5
b 8 7
b 5 1
b 7 3
b 2 4
b 1 0
b 4 6
b 3 2
sh ./ExportOracleTestV100.sh oracle11 test password TestV100 1521 orcl --verbose
Note: 'test' user has full access on TestV100 schema.
[root@abc-repo-app1 rv]# sh ./ExportOracleTestV100.sh oracle11 test password TestV100 1521 orcl --verbose
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/11/02 12:40:07 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.4.1
15/11/02 12:40:07 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/11/02 12:40:07 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
15/11/02 12:40:07 INFO manager.SqlManager: Using default fetchSize of 1000
15/11/02 12:40:07 INFO tool.CodeGenTool: Beginning code generation
15/11/02 12:40:08 INFO manager.OracleManager: Time zone has been set to GMT
15/11/02 12:40:08 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "TestV100"."xy100" t WHERE 1=0
15/11/02 12:40:08 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.IllegalArgumentException: There is no column found in the target table "TestV100"."xy100". Please ensure that your table name is correct.
java.lang.IllegalArgumentException: There is no column found in the target table "TestV100"."xy100". Please ensure that your table name is correct.
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1658)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
As you can see above, Sqoop version is 1.4.5-cdh5.4.1
I somehow managed to get it this far. I see a lot of posting online for sqoop import and this error, and solution is to change the table name to UPPERCASE in the command. But, I am running export. Also, the oracle table HAS to be created with mixed case.
I hope I gave all required information here. Can someone please help or point to some place which can help me get past this error?
There is no straight way to list all the columns of a table in Apache Sqoop like sqoop-list-columns, so first we should retrieve the columns of the particular table and transform to a file containing the column names of particular table.
Controlling Parallelism. Sqoop imports data in parallel from most database sources. You can specify the number of map tasks (parallel processes) to use to perform the import by using the -m or --num-mappers argument. Each of these arguments takes an integer value which corresponds to the degree of parallelism to employ ...
To optimize performance, set the number of map tasks to a value lower than the maximum number of connections that the database supports. Controlling the amount of parallelism that Sqoop will use to transfer data is the main way to control the load on your database.
Sqoop performs highly efficient data transfers by inheriting Hadoop's parallelism. To help Sqoop split your query into multiple chunks that can be transferred in parallel, you need to include the $CONDITIONS placeholder in the where clause of your query.
table name in uppercase worked.
sqoop export --connect jdbc:oracle:thin:@xyzx:1569:xyz --username xyz --password xyz --table CIPHADOOPCUSTOMERREPORT --export-dir /apps/hive/warehouse/ciprpt.db/dtd_customer_report --input-fields-terminated-by "\t" --input-lines-terminated-by "\n" --verbose -m 8 --input-null-string '\N' --input-null-non-string '\N'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With