
How to use Sqoop in Java Program?


I know how to use Sqoop from the command line, but I don't know how to invoke a Sqoop command from a Java program. Can anyone share some sample code?

asked Feb 10 '12 14:02 by pradeep




2 Answers

You can run Sqoop from inside your Java code by including the Sqoop jar in your classpath and calling the Sqoop.runTool() method. You have to build the Sqoop parameters programmatically, exactly as they would appear on the command line (e.g. --connect).

Please pay attention to the following:

  • Make sure that the Sqoop tool name (e.g. import or export) is the first parameter.
  • Pay attention to classpath ordering. Execution may fail because Sqoop requires one version of a library and your application pulls in a different one, so ensure that the libraries Sqoop requires are not shadowed by your own dependencies. I hit this with commons-io: Sqoop requires v1.4, and I got a NoSuchMethod exception because I was using commons-io v1.2.
  • Each argument must be a separate array element. For example, "--connect jdbc:mysql:..." should be passed as two separate elements in the array, not one.
  • The Sqoop parser accepts double-quoted parameters, so use double quotes if you need to (I suggest always). The only exception is the --fields-terminated-by parameter, which expects a single char, so don't double-quote it.
  • I'd suggest separating the logic that builds the command-line arguments from the actual execution, so the argument logic can be tested properly without running the tool.
  • It is better to pass the --hadoop-home parameter, so the run does not depend on the environment.
  • The advantage of Sqoop.runTool() over Sqoop.main() is that runTool() returns the exit code of the execution.
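For the classpath-ordering point, one way to see which jar actually supplied a class is to ask the JVM for its code source. This is only a diagnostic sketch: the class name WhichJar and the method locate() are made up for illustration, and the default argument is a JDK class so the snippet runs without any extra jars.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns where a class was loaded from, or a note when the JVM reports no location. */
    public static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // JDK classes loaded by the bootstrap loader have no code source.
            return "(JDK / bootstrap class loader)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name to inspect, for example
        // org.apache.commons.io.IOUtils when chasing the commons-io conflict.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " loaded from " + locate(Class.forName(name)));
    }
}
```

If the location printed for a library class is not the jar Sqoop expects, your own dependency is probably winning on the classpath.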

Hope that helps.

    final int ret = Sqoop.runTool(new String[] { ... });
    if (ret != 0) {
        throw new RuntimeException("Sqoop failed - return code " + Integer.toString(ret));
    }
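As a rough illustration of keeping argument creation separate from execution, the array handed to runTool() could be built by a small helper like the one below. The class name SqoopImportArgs, the method build() and the placeholder values are hypothetical; the flags used (--connect, --username, --table, --target-dir) are standard Sqoop import options, and the password is deliberately left out of the sketch. The helper does not touch Sqoop at all, so it can be unit tested on its own.

```java
import java.util.ArrayList;
import java.util.List;

public class SqoopImportArgs {

    /** Builds the argument array for a Sqoop import; it only assembles strings. */
    public static String[] build(String jdbcUrl, String username, String table, String targetDir) {
        List<String> args = new ArrayList<>();
        args.add("import");        // the tool name must be the first element
        args.add("--connect");     // each flag and its value are separate elements
        args.add(jdbcUrl);
        args.add("--username");
        args.add(username);
        args.add("--table");
        args.add(table);
        args.add("--target-dir");
        args.add(targetDir);
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        String[] args = build("jdbc:mysql://HOSTNAME:PORT/DATABASE_NAME",
                "USERNAME", "TABLE_NAME", "/tmp/TABLE_NAME");
        System.out.println(String.join(" ", args));
        // With the Sqoop jar on the classpath, the array would then be passed on:
        // int ret = Sqoop.runTool(args);
    }
}
```

A test can then assert on the returned array (tool name first, flag and value in adjacent elements) without starting a real import.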

answered Sep 18 '22 17:09 by Harel Ben Attia


Below is sample code for using Sqoop from a Java program to import data from MySQL into HDFS/HBase. Make sure the Sqoop jar is on your classpath:

        SqoopOptions options = new SqoopOptions();
        options.setConnectString("jdbc:mysql://HOSTNAME:PORT/DATABASE_NAME");
        //options.setTableName("TABLE_NAME");
        //options.setWhereClause("id>10");     // this where clause works when importing the whole table, i.e. when setTableName() is used
        options.setUsername("USERNAME");
        options.setPassword("PASSWORD");
        //options.setDirectMode(true);    // Make sure direct mode is off when importing data to HBase
        options.setNumMappers(8);         // Default value is 4
        options.setSqlQuery("SELECT * FROM user_logs WHERE $CONDITIONS limit 10");
        options.setSplitByCol("log_id");

        // HBase options
        options.setHBaseTable("HBASE_TABLE_NAME");
        options.setHBaseColFamily("colFamily");
        options.setCreateHBaseTable(true);    // Create HBase table, if it does not exist
        options.setHBaseRowKeyColumn("log_id");

        int ret = new ImportTool().run(options);

As Harel suggested, the return value of the run() method can be used for error handling. Hope this helps.

answered Sep 21 '22 17:09 by VikasG