
Recommendation Systems using Solr and Mahout [closed]

I've been reading about using Solr and Mahout for developing Recommendation Systems.

As I understand it, they handle two different problems.

  1. Since Solr is a search engine plus classification system, it is used mostly for recommendations like "more like this" in Drupal (http://jamidwyer.com/d7/node/21) or the "Related" feature on Stack Overflow.

  2. Mahout implements machine learning algorithms such as collaborative filtering. It can be used to implement features like Amazon's suggestions, which are based on a user's previous actions (likes, bought items).

My questions:

Are they used to address two different problems?

Can they be integrated?

I read that Mahout does offline processing and is scalable. Does this mean Solr cannot be scaled?

Ashika Umanga Umagiliya asked Nov 28 '12 01:11


3 Answers

These are different tools for different problems. Solr doesn't really make recommendations; it suggests similar documents based on their content. This is not personalized, in the sense that it doesn't take the user into account. It's very good at this specific problem.

Taste / Mahout are for collaborative filtering, which is not specific to documents or any other type of thing, and differs crucially in that "similar items" and recommendations are based on user-item interactions, not item properties.
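To make the distinction concrete, here is a minimal sketch of item-based collaborative filtering in plain Python. It is not Mahout's or Taste's API; the data and the function names `item_similarities` and `recommend` are hypothetical, chosen only for illustration. Note that the similarity is computed purely from which users interacted with which items, never from item properties.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical interaction data: user -> set of items the user bought or liked.
interactions = {
    "alice": {"book", "lamp", "pen"},
    "bob": {"book", "pen"},
    "carol": {"lamp", "desk"},
    "dave": {"book", "pen", "desk"},
}

def item_similarities(interactions):
    """Cosine similarity between items, based only on who interacted with what."""
    users_by_item = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            users_by_item[item].add(user)
    sims = defaultdict(dict)
    for a, users_a in users_by_item.items():
        for b, users_b in users_by_item.items():
            if a == b:
                continue
            overlap = len(users_a & users_b)
            if overlap:
                sims[a][b] = overlap / sqrt(len(users_a) * len(users_b))
    return sims

def recommend(user, interactions, top_n=3):
    """Score unseen items by summed similarity to items the user already has."""
    sims = item_similarities(interactions)
    seen = interactions[user]
    scores = defaultdict(float)
    for item in seen:
        for other, sim in sims[item].items():
            if other not in seen:
                scores[other] += sim
    # Sort by descending score, breaking ties by item name for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

print(recommend("bob", interactions))
```

A content-based system such as Solr's "more like this" would instead compare the text or fields of the items themselves, so it returns the same neighbours for every user.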

Both scale well, depending on what you need and what you mean by scaling. There is no reason to doubt Solr.

Regarding Mahout and recommenders, briefly, it has two pieces. One piece (Taste) is real-time, not Hadoop-based, and scales to moderate data sets (maybe 10M data points) on one machine. Mahout then adds a Hadoop-based, not-real-time, batch implementation that can scale larger. (Ad: I'm the primary author of the above, and am at work on a next-gen system based on both called Myrrix. It will appeal if you are interested in both scalable and real-time Mahout-style recommenders.)

If you are interested in a company putting together a platform based on the above, including Solr, you should look at NGDATA.

Sean Owen answered Oct 17 '22 04:10


You're right, they address two different problems, and so far I haven't found any existing integration that works out of the box.

What you could do is to use the Mahout classification results to add further information to your indexed documents which can then be used for boosting purposes.
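As a rough sketch of that idea, assume each document was indexed with an extra field holding a category predicted offline. The field name `predicted_category`, the core name `products`, and the helper `boosted_query_url` are all hypothetical; `defType=edismax` and the `bq` boost query are standard Solr query parameters. The snippet only builds the request URL and does not contact a server.

```python
from urllib.parse import urlencode

def boosted_query_url(base_url, user_query, predicted_category, boost=2.0):
    """Build a Solr select URL that boosts documents tagged with a category.

    Assumes a hypothetical `predicted_category` field filled in at index
    time from an offline classification job.
    """
    params = {
        "q": user_query,
        "defType": "edismax",  # edismax supports the bq (boost query) parameter
        "bq": f'predicted_category:"{predicted_category}"^{boost}',
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"

# Hypothetical local Solr core; adjust host and core name to your setup.
url = boosted_query_url(
    "http://localhost:8983/solr/products/select", "usb cable", "electronics"
)
print(url)
```

Documents that match the boost query are ranked higher, but documents that do not match it are still returned, so the classification output influences ordering without filtering anything out.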

Regarding your last question: Solr can scale, and with the just-released version 4.0 it scales even better than before. But it serves a different purpose and scales well for it.

Your question is a bit unspecific, so I hope this helps in some way.

Cheers

pagid answered Oct 17 '22 02:10


If you're willing to get your hands dirty, you can actually combine Solr with collaborative filtering to make a really sweet search-aware recommendation system. That is, given a search S, the searcher's purchase history P_i, and everyone else's purchase histories P_j where j≠i, you can return results that satisfy the search S but are boosted toward items that the searcher would probably like, based on other similar users.

Here's a blog post that I wrote that could point you in the right direction: http://opensourceconnections.com/blog/2013/10/05/search-aware-product-recommendation-in-solr/
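For a feel of the general shape, here is a minimal re-ranking sketch. It is not the method from the blog post; the function `rerank`, the `weight` parameter, and the sample scores are assumptions made for illustration. It takes hits that already satisfy the search S and multiplies each relevance score by a boost derived from a collaborative-filtering affinity score.

```python
def rerank(search_hits, affinity, weight=0.5):
    """Re-order search hits by relevance times a personalization boost.

    search_hits: list of (item_id, relevance) pairs that already match S.
    affinity: item_id -> score in [0, 1] estimating how much the searcher
              would like the item, derived from other users' purchases.
    """
    def boosted(hit):
        item_id, relevance = hit
        return relevance * (1.0 + weight * affinity.get(item_id, 0.0))

    ranked = sorted(search_hits, key=boosted, reverse=True)
    return [item_id for item_id, _ in ranked]

# Hypothetical search results and affinity scores for one searcher.
hits = [("a", 1.0), ("b", 0.9), ("c", 0.5)]
affinity = {"b": 1.0}
print(rerank(hits, affinity))
```

Because the boost is multiplicative, an item that does not match the search never appears, and an item with no affinity score keeps its plain relevance.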

JnBrymn answered Oct 17 '22 04:10