How to balance number of ratings versus the ratings themselves?

For a school project, we'll have to implement a ranking system. However, we figured that a dumb average rank would suck: something that one user rated 5 stars would have a better average than something 188 users rated 4 stars, and that's just stupid.

So I'm wondering if any of you have an example algorithm for "smart" ranking. It only needs to take into account the ratings given and the number of ratings.

Thanks!

asked Mar 22 '10 by zneak


1 Answer

You can use a method inspired by Bayesian probability. The gist of the approach is to have an initial belief about the true rating of an item, and use users' ratings to update your belief.

This approach requires two parameters:

  1. What do you think is the true "default" rating of an item, if you have no ratings at all for the item? Call this number R, the "initial belief".
  2. How much weight do you give to the initial belief, compared to the user ratings? Call this W, where the initial belief is "worth" W user ratings of that value.

With the parameters R and W, computing the new rating is simple: assume you have W ratings of value R along with any user ratings, and compute the average. For example, if R = 2 and W = 3, we compute the final score for various scenarios below:

  • 100 (user) ratings of 4: (3*2 + 100*4) / (3 + 100) = 3.94
  • 3 ratings of 5 and 1 rating of 4: (3*2 + 3*5 + 1*4) / (3 + 3 + 1) = 3.57
  • 10 ratings of 4: (3*2 + 10*4) / (3 + 10) = 3.54
  • 1 rating of 5: (3*2 + 1*5) / (3 + 1) = 2.75
  • No user ratings: (3*2 + 0) / (3 + 0) = 2
  • 1 rating of 1: (3*2 + 1*1) / (3 + 1) = 1.75
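The scenarios above can be sketched in a few lines of plain Python (the function name and signature are illustrative, not from any library):

```python
def bayesian_average(ratings, R=2.0, W=3.0):
    """Score a list of user ratings, treating the prior belief R
    as W extra "virtual" ratings mixed into a plain average."""
    return (W * R + sum(ratings)) / (W + len(ratings))

# Reproducing the scenarios above (R = 2, W = 3):
print(bayesian_average([4] * 100))      # ~3.94
print(bayesian_average([5, 5, 5, 4]))   # ~3.57
print(bayesian_average([5]))            # 2.75
print(bayesian_average([]))             # 2.0 (no user ratings: the prior)
```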

This computation takes into consideration the number of user ratings, and the values of those ratings. As a result, the final score roughly corresponds to how happy one can expect to be about a particular item, given the data.

Choosing R

When you choose R, think about what value you would be comfortable assuming for an item with no ratings. Is the typical no-rating item actually 2.4 out of 5, if you were to instantly have everyone rate it? If so, R = 2.4 would be a reasonable choice.

You should not use the minimum value on the rating scale for this parameter, since an item rated extremely poorly by users should end up "worse" than a default item with no ratings.

If you want to pick R using data rather than just intuition, you can use the following method:

  • Consider all items with at least some threshold of user ratings (so you can be confident that the average user rating is reasonably accurate).
  • For each item, assume its "true score" is the average user rating.
  • Choose R to be the median of those scores.

If you want to be slightly more optimistic or pessimistic about a no-rating item, you can choose R to be a different percentile of the scores, for instance the 60th percentile (optimistic) or 40th percentile (pessimistic).
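A minimal sketch of that data-driven procedure, assuming ratings are grouped per item as lists of numbers (the container shape and names here are assumptions for illustration):

```python
from statistics import mean, median

def estimate_R(items_ratings, min_ratings=20):
    """Pick R as the median of per-item average ratings,
    considering only items with enough ratings to trust."""
    trusted_scores = [mean(r) for r in items_ratings if len(r) >= min_ratings]
    return median(trusted_scores)

# Illustrative data: the 5-star item has too few ratings to count.
catalog = [[4] * 25, [2] * 30, [5] * 5, [3] * 20]
print(estimate_R(catalog))  # 3
```

For the optimistic or pessimistic variants, swap `median` for a percentile function (e.g. `statistics.quantiles`) at the 60th or 40th percentile.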

Choosing W

The choice of W should depend on how many ratings a typical item has, and how consistent ratings are. W can be higher if items naturally obtain many ratings, and W should be higher if you have less confidence in user ratings (e.g., if you have high spammer activity). Note that W does not have to be an integer, and can be less than 1.

Choosing W is a more subjective matter than choosing R. However, here are some guidelines:

  • If a typical item obtains C ratings, then W should not exceed C, or else the final score will be more dependent on R than on the actual user ratings. Instead, W should be close to a fraction of C, perhaps between C/20 and C/5 (depending on how noisy or "spammy" ratings are).
  • If historical ratings are usually consistent (for an individual item), then W should be relatively small. On the other hand, if ratings for an item vary wildly, then W should be relatively large. You can think of this algorithm as "absorbing" W ratings that are abnormally high or low, turning those ratings into more moderate ones.
  • In the extreme, setting W = 0 is equivalent to using only the average of user ratings, and setting W = infinity is equivalent to proclaiming that every item has a true rating of R regardless of the user ratings. Clearly, neither of these extremes is appropriate.
  • Setting W too large can have the effect of favoring an item with many moderately-high ratings over an item with slightly fewer exceptionally-high ratings.
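To see the last point concretely, here is a small comparison (the numbers are made up for illustration): with a large W, an item with many ratings of 4 outscores an item with fewer ratings of 5, while a small W ranks them the other way.

```python
def bayesian_average(ratings, R, W):
    return (W * R + sum(ratings)) / (W + len(ratings))

many_good = [4] * 50   # many moderately-high ratings
few_great = [5] * 10   # fewer exceptionally-high ratings

# Large W: the prior dominates the smaller sample, favoring the bigger one.
print(bayesian_average(many_good, R=2.5, W=20))  # ~3.57
print(bayesian_average(few_great, R=2.5, W=20))  # ~3.33

# Small W: the exceptionally-rated item wins, as it probably should.
print(bayesian_average(many_good, R=2.5, W=2))   # ~3.94
print(bayesian_average(few_great, R=2.5, W=2))   # ~4.58
```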
answered Oct 15 '22 by k_ssb