Wilson's Confidence Interval for 5 Star Rating

Tags:

rating

Wilson's Confidence Interval (WCI) takes binary votes as input: TRUE or FALSE, i.e. "upvotes" and "downvotes" respectively. From these votes it generates a rating.

For the purpose of my project, I think WCI is perfect. However, a binary upvote/downvote is not enough to describe the thing I am rating.

That's where 5 star rating comes in, and this is where I need someone to disprove my logic. Now I'm thinking, if I were to implement a 5 star rating with WCI then the following should work without hacking the internals of the confidence interval.

For each star in the rating widget we assign a unique integer value. Each value either counts as a positive (upvote) or negative (downvote). So the following values would be:

1/5 stars: -2
2/5 stars: -1
3/5 stars: 1
4/5 stars: 2
5/5 stars: 3

To summarise the above values: the minimum vote of 1 star is classed as 2 downvotes. A vote of 2 stars is classed as 1 downvote. For the middle vote of 3 stars we give 1 upvote. For 4 stars we give 2 upvotes. And for the maximum of 5 stars we give 3 upvotes.

Please disprove this logic: why won't this work? Maybe it goes against the "average person's understanding" of a star rating system?


asked Oct 26 '13 23:10

Michael Rich

3 Answers

It's easy to think of the following 'workaround', which converts a multi-level rating system to binary 'upvote/downvote'-style votes (which can then be scored using the lower bound of the Wilson score confidence interval):

Let's say you have the popular 5 star rating system. So we have a number of votes, each having a value of: 1, 2, 3, 4 or 5.

To 'convert' these ratings to up/down votes, use the following rule:

For star rating -- Add

*     - 0.00 to up votes and 1.00 to down votes (i.e. a full down vote)
**    - 0.25 to up votes and 0.75 to down votes
***   - 0.50 to up votes and 0.50 to down votes
****  - 0.75 to up votes and 0.25 to down votes
***** - 1.00 to up votes and 0.00 to down votes (i.e. a full up vote)

After we reduce the 5 star ratings to up/down ratings, we can proceed with the usual score calculations described in Evan Miller's article.
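The conversion above can be sketched roughly as follows. This is only an illustration: the names `UP_FRACTION`, `wilson_lower_bound` and `star_score` are made up for this example, and z = 1.96 (about 95% confidence) is an assumed default. Note that feeding fractional vote counts into the Wilson formula is the approximation this answer proposes, not something the formula was derived for.

```python
import math

# Fraction of an upvote contributed by each star value (from the table above).
# The remainder (1 - fraction) counts as the downvote share.
UP_FRACTION = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}


def wilson_lower_bound(up, n, z=1.96):
    """Lower bound of the Wilson score interval for `up` positives out of `n` votes."""
    if n == 0:
        return 0.0
    p = up / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    # Clamp at 0 to absorb floating-point rounding when p == 0.
    return max(0.0, (centre - margin) / denom)


def star_score(stars, z=1.96):
    """Score a list of 1..5 star votes via fractional up/down votes."""
    up = sum(UP_FRACTION[s] for s in stars)
    return wilson_lower_bound(up, len(stars), z)
```

With this sketch, more votes at the same star level give a higher score (less uncertainty), and an item with no votes scores 0.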

I am not a statistician or mathematician, so I would love to hear from other people whether this makes sense and what the issues with this approach might be.


answered Oct 17 '22 03:10

Nikolay Suvandzhiev

First, try to understand the intuition behind WCI or, even simpler, the normal approximation interval ( http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval ).

The intuition behind all this interval calculation is simple. You calculate the sample mean and the standard deviation of that mean (its standard error). The interval is mean ± z*std.

In your case calculating the mean is simple. It is the mean of the ratings themselves. Let p1 be the fraction of 1-star ratings, and likewise p2, ..., p5 for the other star values, so that p1+p2+...+p5 = 1. Assume you are calculating these statistics from n samples. The mean of your data is 1*p1+2*p2+...+5*p5.

The variance of that mean is ( E(x^2)-(E(x))^2 )/n = ( (p1*1^2 + p2*2^2 + ... + p5*5^2) - (1*p1 + 2*p2 + ... + 5*p5)^2 )/n

Since std = sqrt(var), it is pretty straightforward to calculate the normal approximation interval. I will let you work on extending this to WCI.
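A minimal sketch of the normal approximation interval described above, assuming the votes are given as a list of five counts (1-star first). The function name `mean_rating_interval` and the default z = 1.96 are choices made for this example only.

```python
import math


def mean_rating_interval(counts, z=1.96):
    """Normal approximation interval for the mean star rating.

    counts[k-1] is the number of k-star votes, for k = 1..5.
    Returns (lower, upper).
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("need at least one rating")
    # E(x) and E(x^2), using p_k = counts[k-1] / n.
    mean = sum(k * c for k, c in enumerate(counts, start=1)) / n
    mean_sq = sum(k * k * c for k, c in enumerate(counts, start=1)) / n
    # Variance of the mean: (E(x^2) - E(x)^2) / n.
    var_of_mean = (mean_sq - mean * mean) / n
    half_width = z * math.sqrt(max(var_of_mean, 0.0))
    return mean - half_width, mean + half_width
```

For ranking, one would probably sort by the lower end of the interval, in the same spirit as using the Wilson lower bound for up/down votes.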

answered Oct 17 '22 05:10

ElKamina

The biggest problem with this scheme is that a single 5-star rating (3 upvotes) carries as much weight as three 2-star ratings (1 downvote each). Also, an item with 300 3-star ratings (which should be a mediocre score) will have the same score as an item with 100 5-star ratings (which should be a perfect score), since both amount to 300 upvotes and no downvotes.
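To make the arithmetic concrete, here is a small sketch of the question's mapping. The names `VOTES` and `to_up_down` are invented for this example.

```python
# (upvotes, downvotes) contributed by each star value under the question's scheme.
VOTES = {1: (0, 2), 2: (0, 1), 3: (1, 0), 4: (2, 0), 5: (3, 0)}


def to_up_down(stars):
    """Convert a list of 1..5 star votes into (upvotes, downvotes) totals."""
    up = sum(VOTES[s][0] for s in stars)
    down = sum(VOTES[s][1] for s in stars)
    return up, down


mediocre = to_up_down([3] * 300)  # 300 three-star ratings
perfect = to_up_down([5] * 100)   # 100 five-star ratings
print(mediocre, perfect)
```

Both items end up with identical up/down totals, so any score computed from those totals cannot tell them apart.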

What you could do is calculate a Wilson confidence interval for each possible score. The lower bound of each interval is then the weight of that score towards the (weighted) average.
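One possible reading of this suggestion, sketched under assumptions: for each star value k, treat "voted k" as the positive outcome among all n votes, take the Wilson lower bound of that proportion as the weight of k, and normalise by the sum of the weights. The normalisation step, the function names and z = 1.96 are all assumptions of this sketch, not something stated above.

```python
import math


def wilson_lower_bound(positive, n, z=1.96):
    """Lower bound of the Wilson score interval for `positive` out of `n`."""
    if n == 0:
        return 0.0
    p = positive / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    # Clamp at 0 to absorb floating-point rounding when p == 0.
    return max(0.0, (centre - margin) / denom)


def weighted_star_average(counts, z=1.96):
    """Weighted average star value; counts[k-1] is the number of k-star votes."""
    n = sum(counts)
    if n == 0:
        return 0.0
    # Weight of each star value = Wilson lower bound of its share of the votes.
    weights = [wilson_lower_bound(c, n, z) for c in counts]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(k * w for k, w in enumerate(weights, start=1)) / total
```

Under this reading, 300 3-star ratings average to about 3 and 100 5-star ratings to about 5, so the two items from the previous paragraph are no longer tied.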

answered Oct 17 '22 04:10

Apocalisp
