I'm building a content-based movie recommender system. It's simple, just let a user enter a movie title and the system will find a movie which has the most similar features.
After calculating similarity and sorting the scores in descending order, I find the corresponding movies of 5 highest similarity scores and return to users.
Everything works well till now when I want to evaluate the accuracy of the system. Some formulas that I found on Google just evaluate the accuracy based on rating values (comparing predicted rating and actual rating like RMSE). I did not change similarity score into rating (scale from 1 to 5) so I couldn't apply any formula.
Can you suggest any way to convert similarity score into predicted rating so that I can apply RMSE then? Or is there any idea of solution to this problem ?
Common Metrics UsedPredictive accuracy metrics, classification accuracy metrics, rank accuracy metrics, and non-accuracy measurements are the four major types of evaluation metrics for recommender systems.
Based on a rule of thumb, it can be said that RMSE values between 0.2 and 0.5 shows that the model can relatively predict the data accurately. In addition, Adjusted R-squared more than 0.75 is a very good value for showing the accuracy. In some cases, Adjusted R-squared of 0.4 or more is acceptable as well.
The model can only make recommendations based on existing interests of the user. In other words, the model has limited ability to expand on the users' existing interests.
How do Content Based Recommender Systems work? A content based recommender works with data that the user provides, either explicitly (rating) or implicitly (clicking on a link). Based on that data, a user profile is generated, which is then used to make suggestions to the user.
Do you have any ground truth? For instance, do you have information about the movies that a user has liked/seen/bought in the past? It doesn't have to be a rating but in order to evaluate the recommendation you need to know some information about the user's preferences.
If you do, then there are other ways to measure the accuracy besides RMSE. RMSE is used when we predict ratings (as you said is the error between the real rating and the prediction) but in your case you are generating top N recommendations. In that case you can use precision and recall to evaluate your recommendations. They are very used in Information Retrieval applications (see Wikipedia) and they are also very common in Recommender Systems. You can also compute F1 metric which is an harmonic mean of precision and recall. You'll see they are very simple formulas and easy enough to implement.
"Evaluating Recommendar Systems" by Guy Shani is a very good paper on how to evaluate recommender systems and will give you a good insight into all this. You can find the paper here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With