After reading How Not to Sort by Average Rating, I was curious whether anyone has a Python implementation of the lower bound of the Wilson score confidence interval for a Bernoulli parameter.
Reddit uses the Wilson score interval for comment ranking. An explanation and Python implementation can be found here:
# Rewritten code from /r2/r2/lib/db/_sorts.pyx

from math import sqrt

def confidence(ups, downs):
    n = ups + downs

    if n == 0:
        return 0

    z = 1.0  # 1.44 = 85%, 1.96 = 95%
    phat = float(ups) / n
    return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))
I think this one gets the Wilson calculation wrong: with 1 up and 0 down you get NaN, because you can't take the square root of a negative value.
The correct formula can be found in the Ruby example from the article How Not to Sort by Average Rating:
return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))
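For reference, that return expression can be wrapped into a self-contained Python 3 function. This is only a minimal sketch: the name `wilson_lower_bound` and the default `z=1.96` are assumptions chosen for illustration, not taken from the Reddit source, and the expression is split into a centre term and a margin term purely for readability.

```python
from math import sqrt


def wilson_lower_bound(ups, downs, z=1.96):
    """Lower bound of the Wilson score interval for a Bernoulli parameter.

    ups, downs: non-negative vote counts.
    z: normal quantile for the desired confidence (hypothetical default,
       1.96 corresponds to roughly 95%).
    """
    n = ups + downs
    if n == 0:
        return 0.0  # no votes, so there is nothing to estimate

    phat = ups / n  # observed fraction of positive votes
    centre = phat + z * z / (2 * n)
    # The term under the square root is never negative for 0 <= phat <= 1.
    margin = z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


if __name__ == "__main__":
    print(wilson_lower_bound(1, 0, z=1.0))  # 0.5
    print(wilson_lower_bound(0, 0))         # 0.0
```

With 1 up, 0 down and z = 1.0 this form returns 0.5 rather than failing, since the quantity under the square root stays non-negative.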