Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the best algorithm to calculate the most scored item?

Tags:

algorithm

math

I have an music items that are scored by users between 1 to 5, and I need a formula to get the 5 most scored items.

But obviously an item that get 3.5 average score from 1000 different users will be more scored then an item thet get 4.9 average score from only 5 users... in other words I think that if an item get attention from people to score it, this indicates that the item is interesting. so in the calculation the votesCount parameter need to have a power. (how much power? I don't sure, and I asking it you to get ideas).

I think that we need the following parameters in the function: votesAverage, votesCount.

like image 329
Fitzchak Yitzchaki Avatar asked Jan 25 '10 18:01

Fitzchak Yitzchaki


3 Answers

Weighted voting for 5-star systems with lots of voters

You can use Bayesian estimates to calculate weighted voting.

IMDb (Internet Movie Database) uses this calculation to determine its IMDb Top 250. (Note: IMDb uses 10 stars but the formulas are identical using 5 stars).

The formula for calculating the Top Rated 250 Titles gives a true Bayesian estimate:

weighted rating (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C

where:

  • R = average for the movie (mean) = (Rating)
  • v = number of votes for the movie = (votes)
  • m = minimum votes required to be listed in the Top 250 (currently 3000)
  • C = the mean vote across the whole report (currently 6.9)

IMDb Reference

Wikipedia Reference

like image 86
Robert Cartaino Avatar answered Oct 17 '22 07:10

Robert Cartaino


The reddit scoring algorithm is probably the best bet if you really want to do it the right way. It's explained in detail here and at a high level by xkcd author Randall here.

The problem is it doesn't really work for five-star ratings which is what you're going for. You should be able to generalize reddit's sorting system to use ratings. Heck, it's probably done somewhere already. I'm going to look for it.

like image 42
Welbog Avatar answered Oct 17 '22 05:10

Welbog


A simple way to balance the system is to add a fixed number of hypothetical users (say the count is H) who all vote for the long-term average A of all your pieces. Say that average is 3; then the formula becomes

Score = (votesCount x votesAverage + H x A) / (votesCount + H)

Now when votesCount grows, the relative impact of the hypothetical average-voters diminishes.

You can set H experimentally, or by thinking about it. E.g. if you think that 20 votes is sufficient to establish relatively strong rating, you could set H=5. Say.

like image 38
Antti Huima Avatar answered Oct 17 '22 07:10

Antti Huima