
Why do rank-based recommendation systems use NDCG?

Rank-based recommendation systems use NDCG to evaluate recommendation accuracy. However, accuracy rate and recall rate are sometimes used to evaluate top-N recommendation. Does a high NDCG mean that the accuracy rate is also high? I ran a ListRankMF algorithm on the MovieLens 100k dataset, and the accuracy rate was very low, only about 8%. What is the relation between NDCG and accuracy rate?

Try Leung asked Dec 13 '15 14:12



1 Answer

NDCG is most helpful when the objective of the recommender system is to return some relevant results, and order is important. For example, recommending a translation, or recommending a bank account. It's not harmful if we miss relevant results, but for a good user experience we want them in a meaningful order.

Recall is most helpful when the objective of the recommender system is to return all relevant results, and order is unimportant. For example, a potential medical diagnosis or prescription. It is harmful if we miss a relevant result, since that might be the correct diagnosis or cure. The order is not important since we expect the medic to read through all the possibilities and use their expert knowledge for the final decision.

Suppose there are 5 drugs that we could recommend a doctor give to a patient (A to E), and 5 that we should not recommend (F to J). Our recommender system outputs the recommendations A, B, C, D. This gives us the following evaluations:

  • NDCG = 1.0
  • Recall = 0.8

In this case recall clearly shows we did not do as well as we could (since we did not recommend drug E), whereas NDCG leads us to believe we made the perfect recommendations, because every item in the returned list is relevant.
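The two numbers for the drug example can be reproduced with a minimal sketch. It assumes binary relevance (a drug is either relevant or not) and an NDCG cutoff equal to the length of the returned list, neither of which is stated explicitly in the answer; the function names are illustrative only.

```python
import math

def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1), ranks starting at 1."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(recommended, relevant, k):
    """NDCG@k with binary relevance (assumption); the ideal ranking is also cut off at k."""
    gains = [1.0 if item in relevant else 0.0 for item in recommended[:k]]
    ideal = [1.0] * min(len(relevant), k)
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

def recall(recommended, relevant):
    """Fraction of all relevant items that appear in the recommendations."""
    return len(set(recommended) & relevant) / len(relevant) if relevant else 0.0

relevant = set("ABCDE")      # drugs A to E should be recommended
recommended = list("ABCD")   # what the recommender returned

print(ndcg_at_k(recommended, relevant, k=4))  # 1.0
print(recall(recommended, relevant))          # 0.8
```

Note that the NDCG of 1.0 depends on the cutoff: if the ideal ranking were allowed 5 positions instead of 4, the missing drug E would pull NDCG below 1.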

If we were instead recommending books, then NDCG would be more appropriate. Recall is not so informative since there may be hundreds of relevant books, but we cannot expect a user to read through a list of hundreds of books to pick just one to read. NDCG would tell us if we are at least recommending some meaningful subset of what is possible.
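The book case can be sketched the same way. The catalogue below is hypothetical (200 relevant books with made-up identifiers, and a top-10 list that happens to be entirely relevant), and relevance is again assumed to be binary; the point is only to show how recall stays small even for a list that is as good as a list of 10 can be.

```python
import math

def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary relevance (assumption); ideal ranking truncated at k."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0

def recall_at_k(ranked, relevant, k):
    """Share of all relevant items that appear in the top k."""
    return len(set(ranked[:k]) & relevant) / len(relevant)

# Hypothetical data: 200 relevant books, of which the user is shown 10.
relevant_books = {f"book_{i}" for i in range(200)}
top_10 = [f"book_{i}" for i in range(10)]

print(ndcg_at_k(top_10, relevant_books, 10))    # 1.0
print(recall_at_k(top_10, relevant_books, 10))  # 0.05
```

With 200 relevant items, recall at 10 can never exceed 10/200, so it says little about whether the short list itself is good, while NDCG still distinguishes a well-ordered relevant list from a poor one.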

Ben Horsburgh answered Oct 27 '22 07:10