
Writing a basic recommendation engine [closed]

I'm looking to write a basic recommendation engine that will take and store a list of numeric IDs (which relate to books), compare them against other users whose lists share a high number of identical IDs, and recommend additional books based on those matches.

After a bit of Googling, I've found this article, which discusses an implementation of a Slope One algorithm, but it seems to rely on users rating the items being compared. Ideally, I'd like to achieve this without the need for users to provide ratings. I'm assuming that if a user has a book in their collection, they are fond of it.

While it strikes me that I could default a rating of 10 for each book, I'm wondering if there's a more efficient algorithm I could be using. Ideally I'd like to calculate these recommendations on the fly (avoiding batch calculation). Any suggestions would be appreciated.

asked Oct 29 '10 12:10 by ndg



1 Answer

A basic algorithm for your task is a memory-based collaborative filtering recommender. It's quite easy to implement, especially when your items (in your case books) just have IDs and no other features.

As you already said, you do need some kind of rating from the users for the items. But don't think of it as a 1 to 5 star rating; think of it as a binary choice such as 0 (book not read) and 1 (book read), or interested versus not interested.

Then use an appropriate distance measure to calculate the difference between yourself (or whoever the active user is) and every other user, based on their sets of items. Select the n most similar users and pick out the items they have that you haven't rated (or considered, i.e. choice 0).
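Here is a minimal Python sketch of those steps, not a definitive implementation. It assumes each user's collection fits in memory as a set of book IDs. The names `distance`, `recommend`, `n_neighbors`, `top_k`, and the sample data are all made up for illustration. The distance used here (the number of books that only one of the two users owns) is just one possible choice.

```python
from collections import Counter


def distance(a: set, b: set) -> int:
    """Number of books owned by exactly one of the two users (smaller = more similar)."""
    return len(a ^ b)


def recommend(active, collections, n_neighbors=2, top_k=5):
    """Suggest books owned by the users most similar to `active`."""
    mine = collections[active]

    # Rank every other user by distance to the active user.
    ranked = sorted(
        (distance(mine, books), user)
        for user, books in collections.items()
        if user != active
    )
    neighbors = [user for _, user in ranked[:n_neighbors]]

    # Count how many neighbors own each book the active user lacks.
    counts = Counter()
    for user in neighbors:
        counts.update(collections[user] - mine)

    # Most frequently owned first; ties broken by book ID for determinism.
    return sorted(counts, key=lambda book: (-counts[book], book))[:top_k]


# Hypothetical sample data: user name -> set of book IDs.
example = {
    "alice": {1, 2, 3, 4},
    "bob": {1, 2, 3, 5},
    "carol": {1, 2, 4, 5, 6},
    "dave": {7, 8, 9},
}

print(recommend("alice", example))  # [5, 6]
```

With this data, bob and carol are alice's two nearest neighbors. Book 5 ranks first because both of them own it, while only carol owns book 6. Because everything is computed per request from the stored sets, no batch step is needed, although scanning every user will get slow as the user base grows.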

I think in this case a good distance measure would be the 1-norm distance, also called the Manhattan distance. But this is a point where you have to experiment with your dataset to get the best results.
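To make the distance concrete: on 0/1 vectors, the Manhattan distance is simply the number of books on which two users disagree, so it can be computed straight from the ID sets without building vectors. The short check below is a sketch with made-up data; `manhattan`, `to_vector`, and `catalog` are illustrative names, not part of any library.

```python
def manhattan(u, v):
    """1-norm distance between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))


def to_vector(books, catalog):
    """Encode a set of book IDs as a 0/1 vector over the full catalog."""
    return [1 if book_id in books else 0 for book_id in catalog]


# Hypothetical catalog and two users' collections.
catalog = [1, 2, 3, 4, 5, 6]
user_a = {1, 2, 3, 4}
user_b = {1, 2, 3, 5}

d_vec = manhattan(to_vector(user_a, catalog), to_vector(user_b, catalog))
d_set = len(user_a ^ user_b)  # size of the symmetric difference

print(d_vec, d_set)  # 2 2
```

One thing to watch when experimenting: this distance counts raw differences, so users with very large collections tend to look far from everyone. A size-normalised measure such as the Jaccard distance is one alternative worth comparing on your data.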

A nice introduction to this topic is the paper by Breese et al., Empirical Analysis of Predictive Algorithms for Collaborative Filtering, available here (PDF). For a research paper, it's an easy read.

answered Sep 22 '22 16:09 by dermatthias