
MongoDB Hadoop Integration

Tags: mongodb, hadoop

I have installed Robomongo on my desktop, but I am not able to ingest data into a Hive table from Robomongo. I have applied the following steps:

  1. Downloaded the required jars: mongo-java-driver-2.13.3.jar, mongo-hadoop-core-1.4.0.jar, mongo-hadoop-hive-1.4.0.jar, mongodb-driver-3.2.1-javadoc.jar.

  2. Placed the jar files in a temporary folder.

  3. Added these jar files in the Hive script. The script I used is as follows:

ADD JAR /tmp/mongodb/jarfiles/mongo-java-driver-2.13.3.jar;
ADD JAR /tmp/mongodb/jarfiles/mongo-hadoop-core-1.4.0.jar;
ADD JAR /tmp/mongodb/jarfiles/mongo-hadoop-hive-1.4.0.jar;
ADD JAR /tmp/mongodb/jarfiles/mongodb-driver-3.2.1-javadoc.jar

CREATE TABLE individuals
(
  id STRING,
  name STRING,
  age STRING,
  nationality STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","name":"Name","age":"Age","nationality":"Nationality"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:port/admin.test_1');

In place of localhost I have given the IP address, and in place of port I have given the port number. admin is the database name and test_1 is the collection that I am trying to ingest. Every time I run this code I get the following error:

Error: Error while processing statement: java.net.URISyntaxException: Relative path in absolute URI: SERDEPROPERTIES('mongo.columns.mapping'='{"id":%22_id%22,%22name%22:%22Name%22,%22age%22:%22Age%22,%22nationality%22:%22Nationality%22%7D') (state=,code=1)

When I use SERDEPROPERTIES('mongo.columns.mapping'='{}') in the above code, keeping everything else intact, I get the following error:

Error: Error while processing statement: java.net.URISyntaxException: Illegal character in scheme name at index 13: TBLPROPERTIES('mongo.uri'='mongodb://localhost:port/admin.test_1') (state=,code=1)

I am using CDH 5.4. Can anyone tell me how I can resolve this issue?

riz asked Dec 01 '25 00:12


1 Answer

As mentioned in the mongo-hadoop Hive installation instructions, the connector requires at least version 3.0.0 of the driver "uber" jar (called "mongo-java-driver.jar"). You seem to be using v2.13.3, which may not support Hive yet.

You can download the v3+ Java uber driver from the MongoDB Java Driver page. Make sure you select mongo-java-driver and a specific version before clicking the download button. The jar file name should be similar to mongo-java-driver-3.x.x.jar.
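A minimal sketch of the script with the driver swapped, assuming the 3.x uber jar is saved in the same /tmp/mongodb/jarfiles folder. The file name mongo-java-driver-3.x.x.jar is a placeholder for whichever 3.x version you download, and localhost and port stay as placeholders exactly as in the question. The sketch also leaves out the javadoc jar, which only holds documentation, and ends every ADD JAR with a semicolon. The fourth ADD JAR in the question has none, so Hive may be reading the CREATE TABLE text as further jar paths. That would be consistent with both URISyntaxException messages quoting the SERDEPROPERTIES(...) and TBLPROPERTIES(...) tokens rather than the MongoDB URI itself, but treat it as a guess to verify, not a confirmed diagnosis.

```sql
-- Placeholder file name: use the actual 3.x uber jar you downloaded.
ADD JAR /tmp/mongodb/jarfiles/mongo-java-driver-3.x.x.jar;
ADD JAR /tmp/mongodb/jarfiles/mongo-hadoop-core-1.4.0.jar;
ADD JAR /tmp/mongodb/jarfiles/mongo-hadoop-hive-1.4.0.jar;
-- The javadoc jar is intentionally not added; every ADD JAR ends with ';'.

CREATE TABLE individuals
(
  id STRING,
  name STRING,
  age STRING,
  nationality STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","name":"Name","age":"Age","nationality":"Nationality"}')
-- Replace localhost and port with the real host and port number.
TBLPROPERTIES('mongo.uri'='mongodb://localhost:port/admin.test_1');
```

If the table is created without errors, a quick SELECT * FROM individuals LIMIT 5; should show whether the column mapping picks up the documents in test_1.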

Wan Bachtiar answered Dec 02 '25 20:12