
Easy way to Sync Data between MongoDB and Apache Solr

Tags: java, mongodb, solr

I recently started working with MongoDB and Apache Solr. I am using MongoDB as the data store, and I want Apache Solr to index that data to power the search feature in my application.

After some research, I found that there are basically two ways to sync data between MongoDB and Solr.

1) Using the Solr DataImportHandler

For this I used SolrMongoImporter, created by james, and followed his tutorial on GitHub.

The import handler ran without errors and Solr recognized it, but it did not import any documents into Solr. Every run reported updated documents=0.

2) Using the MongoDB Connector

I then looked for something on the MongoDB side and found the MongoDB Connector provided by 10gen.

When I followed the instructions and ran the connector, it appeared to post repeatedly to Solr, but every request in the log is just a commit:

2012-11-24 15:15:20,665 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.010 seconds.
2012-11-24 15:15:21,674 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.009 seconds.
2012-11-24 15:15:22,683 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.008 seconds.
2012-11-24 15:15:23,694 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.010 seconds.
2012-11-24 15:15:24,702 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.008 seconds.
2012-11-24 15:15:25,711 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.008 seconds.
2012-11-24 15:15:26,722 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (POST) with body '<commit />' in 0.010 seconds.

But no data shows up in Solr.
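One way to confirm that the index really is empty (rather than just not showing results for a particular query) is to ask Solr for the total document count. The sketch below is only an illustration: it assumes the default single-core select URL on the same host and port as in the log above, and the function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: default single-core Solr on the host/port seen in the log above.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def parse_num_found(response_text):
    """Extract the total hit count from a Solr JSON (wt=json) response body."""
    return json.loads(response_text)["response"]["numFound"]


def count_indexed_documents(select_url=SOLR_SELECT_URL):
    """Match every document (*:*) but fetch no rows, then return the count."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    with urllib.request.urlopen(select_url + "?" + params) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print("documents in index:", count_indexed_documents())
```

If this prints 0 after the connector has been running for a while, the documents are not reaching Solr at all, as opposed to being indexed but not matched.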

Which approach has worked for you, and is there a good tutorial on MongoDB and Solr integration?

I am also looking for real-time sync between MongoDB and Solr: as soon as a product is added to MongoDB, I want it added to the Solr index and reflected in search results.

I am using MongoDB 2.0.4 and Solr 3.6.1.
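To make the goal concrete, this is roughly what I would expect to happen for each new product: one add message posted to the same update URL that appears in the log above. It is a minimal sketch, not what either tool actually does. The function names are hypothetical, and mapping MongoDB's `_id` to a Solr field called `id` is an assumption about the schema.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Same update URL as in the connector log; adjust for a multi-core setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update?commit=true"


def build_add_xml(doc):
    """Turn a flat dict (one MongoDB document) into a Solr XML <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # Assumption: the Solr schema's unique key is "id", so "_id" maps to it.
        field_name = "id" if name == "_id" else name
        # Lists become repeated fields (requires a multiValued field in the schema).
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=field_name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


def post_to_solr(xml_body, update_url=SOLR_UPDATE_URL):
    """POST an XML update message to Solr and return the HTTP status code."""
    req = urllib.request.Request(
        update_url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    print(post_to_solr(build_add_xml({"_id": 1, "name": "Lamp"})))
```

Every field name sent this way has to exist in the Solr schema (or match a dynamic field), otherwise Solr rejects the document.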

Rajesh Pantula asked Nov 24 '12 15:11


2 Answers

Hadoop is an option for creating Solr indexes. I haven't done this first hand, but I have heard from teams, such as Etsy, that do.

In this course at Lucene Revolution, they talked about using Hadoop to update the indexes in some Solr cores. Unfortunately, I don't think the course material is publicly available.

And in this talk, the speaker covered the Mongo/Hadoop support.

Other related links:

  • Indexing Files via Solr and Java MapReduce
  • Using Hadoop to Create Solr Indexes
  • Mongo-Hadoop Connector

theon answered Sep 20 '22 10:09


Are you running MongoDB in replica set mode? http://docs.mongodb.org/manual/reference/replica-configuration/

In the beginning, I was getting the same output you describe, and there was no data in Solr. After I set up replication, the oplog was created and the MongoDB Connector started synchronizing with Solr correctly. It works quite nicely for me.
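The reason this matters: the connector learns about changes by reading the oplog (`local.oplog.rs` on replica set members), and a standalone `mongod` started without replication options does not keep one. Starting `mongod` with `--replSet <name>` and running `rs.initiate()` once in the mongo shell is enough, even with a single member. The snippet below is not the connector's code, just a rough sketch of the idea, with a hypothetical function name: each oplog entry is turned into an index action.

```python
def oplog_entry_to_action(entry, namespace):
    """Map one oplog entry to an (action, payload) pair for a search index.

    Returns None for entries that belong to other collections or that are
    not data changes (for example no-ops and commands).
    """
    if entry.get("ns") != namespace:
        return None
    op = entry.get("op")
    if op == "i":
        # Insert: "o" holds the full new document.
        return ("add", entry["o"])
    if op == "u":
        # Update: "o" may only hold modifiers such as $set, so the safe move
        # is to re-read the document by the _id stored in "o2" and re-index it.
        return ("refetch", entry["o2"]["_id"])
    if op == "d":
        # Delete: "o" holds the _id of the removed document.
        return ("delete", entry["o"]["_id"])
    return None


if __name__ == "__main__":
    insert = {"op": "i", "ns": "shop.products", "o": {"_id": 1, "name": "Lamp"}}
    print(oplog_entry_to_action(insert, "shop.products"))
```

Without an oplog there is nothing to read, which would explain a log that shows only periodic commits and never an add. The namespace `shop.products` is just an example value.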

Martin Leginus answered Sep 20 '22 10:09