
Python implementation of the Wilson Score Interval?

After reading "How Not to Sort by Average Rating", I was curious whether anyone has a Python implementation of the lower bound of the Wilson score confidence interval for a Bernoulli parameter.

Jeff Bauer asked Apr 05 '12 13:04


2 Answers

Reddit uses the Wilson score interval for comment ranking; an explanation and a Python implementation can be found here:

#Rewritten code from /r2/r2/lib/db/_sorts.pyx

from math import sqrt

def confidence(ups, downs):
    n = ups + downs

    if n == 0:
        return 0

    z = 1.0 #1.44 = 85%, 1.96 = 95%
    phat = float(ups) / n
    return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))
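
The code above hard-codes z. If you would rather pass a confidence level, one possible variation is sketched below. The function name `wilson_lower_bound` and the `confidence` parameter are my own, not Reddit's, and it assumes Python 3.8+ for `statistics.NormalDist` and a two-sided confidence level.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def wilson_lower_bound(ups, downs, confidence=0.95):
    """Lower bound of the Wilson score interval for ups / (ups + downs).

    `confidence` is a two-sided confidence level strictly between 0 and 1.
    This is an illustrative sketch, not Reddit's implementation.
    """
    n = ups + downs
    if n == 0:
        return 0.0  # no votes yet: same convention as the code above

    # Two-sided critical value of the standard normal (0.95 -> about 1.96).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = ups / n

    centre = phat + z * z / (2 * n)
    margin = z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


if __name__ == "__main__":
    # More votes at the same ratio should give a higher lower bound.
    print(wilson_lower_bound(90, 10))
    print(wilson_lower_bound(900, 100))
```

Ranking by this value instead of the raw ratio is what keeps an item with a single upvote from outranking one with 900 upvotes and 100 downvotes.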
Steef answered Sep 29 '22 01:09


I think the Wilson formula in the answer above is wrong: with 1 up and 0 down you get NaN, because you can't take the square root of a negative value.

The correct one can be found in the Ruby example from the article How Not to Sort by Average Rating:

return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)) 
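
As a quick check, the sketch below wraps that expression in a function and evaluates a few edge cases. The name `confidence` and the default `z=1.0` are borrowed from the first answer as assumptions, not taken from the article. The term under the square root is a sum of non-negative parts, so it should never go negative for valid vote counts; for 1 up and 0 down with z = 1.0 the result is 0.5.

```python
from math import isnan, sqrt


def confidence(ups, downs, z=1.0):
    # z=1.0 mirrors the first answer's default; it is an assumption here.
    n = ups + downs
    if n == 0:
        return 0.0
    phat = ups / n
    # phat*(1-phat) >= 0 and z*z/(4*n) > 0, so the radicand is positive.
    radicand = (phat * (1 - phat) + z * z / (4 * n)) / n
    return (phat + z * z / (2 * n) - z * sqrt(radicand)) / (1 + z * z / n)


# Edge cases: all upvotes, all downvotes, and an even split.
for ups, downs in [(1, 0), (0, 1), (10, 0), (5, 5)]:
    score = confidence(ups, downs)
    assert not isnan(score)
    print(ups, downs, round(score, 4))
```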
Gullevek answered Sep 29 '22 03:09