I'm training machine learning models in Python and using the R squared metric from scikit-learn to evaluate them. I decided to play around with scikit-learn's r2_score function, feeding it an array of identical values as y_true and a slightly different, near-constant array as y_pred. I was getting arbitrarily large (negative) values when the input arrays had 10 or more elements, and 0 when they had fewer than 10.
>>> from sklearn.metrics import r2_score
>>> r2_score([213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667,
...           213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667],
...          [213, 214, 214, 214, 214, 214, 214, 214, 214, 214])
-1.1175847590636849e+26
>>> r2_score([213.91666667, 213.91666667, 213.91666667, 213.91666667,
...           213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667],
...          [213, 214, 214, 214, 214, 214, 214, 214, 214])
0.0
You're right that the r2_score output looks wrong. However, this is the result of a floating-point precision issue in the computation rather than a problem with the scikit-learn package itself.
Try running
>>> input_list = [213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667,
...               213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667]
>>> sum(input_list)/len(input_list)
As you can see, the output is not exactly 213.91666667 (a floating-point precision error; you can read more about it here). Why does this matter?
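To make the discrepancy concrete, here is a small check you can run yourself (a sketch; the exact digits printed depend on your platform's floating-point rounding):

input_list = [213.91666667] * 10
mean = sum(input_list) / len(input_list)
print(repr(mean))            # full repr of the computed mean
print(mean == 213.91666667)  # False here: the accumulated sum does not divide back exactly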
Well, the R² section of the scikit-learn User Guide gives the specific formula used to calculate r2_score:

R²(y, ŷ) = 1 - Σᵢ (yᵢ - ŷᵢ)² / Σᵢ (yᵢ - ȳ)²

As you can see, r2_score is simply 1 - (residual sum of squares) / (total sum of squares).
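A quick sanity check of that formula, using some made-up, non-degenerate numbers (the values below are purely illustrative):

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

rss = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
tss = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares

print(1 - rss / tss)             # manual R^2
print(r2_score(y_true, y_pred))  # scikit-learn gives the same value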
In the first case you specify, the residual sum of squares is a number that doesn't really matter on its own. You can calculate it easily; it's about 0.9, which doesn't seem particularly high. However, due to the floating-point error described above, the total sum of squares isn't exactly 0, but rather some very, very small number (on the order of 10^-27, as you'll see below).
Thus, when you divide the residual sum of squares (around 0.9) by the total sum of squares (a very small number), you're left with a very large number. Since that large number is subtracted from 1, you end up with a negative number of huge magnitude as your r2_score output.
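You can reproduce those two magnitudes, and the resulting score, directly; this is a rough sketch, and the exact value of the tiny denominator will vary slightly with your platform:

import numpy as np

y_true = np.array([213.91666667] * 10)
y_pred = np.array([213, 214, 214, 214, 214, 214, 214, 214, 214, 214], dtype=float)

rss = np.sum((y_true - y_pred) ** 2)          # about 0.9
tss = np.sum((y_true - y_true.mean()) ** 2)   # tiny but nonzero, not exactly 0
print(1 - rss / tss)                          # a huge negative number, like the one you saw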
This imprecision in the calculation of the total sum of squares does not occur in the second case, so the denominator is exactly 0 and the function, rather than attempting the undefined division, returns 0.
Looking at the source code of r2_score, we can see essentially the following lines (with the default weights filled in):

import numpy as np

# defaults used when no sample_weight is passed
weight = 1
sample_weight = None

y_true = np.array([213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667,
                   213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667]).reshape(-1, 1)
y_pred = np.array([213, 214, 214, 214, 214, 214, 214, 214, 214, 214]).reshape(-1, 1)

# residual sum of squares
numerator = (weight * (y_true - y_pred) ** 2).sum(axis=0, dtype=np.float64)
# total sum of squares
denominator = (weight * (y_true - np.average(y_true, axis=0,
                                             weights=sample_weight)) ** 2).sum(axis=0, dtype=np.float64)

nonzero_denominator = denominator != 0
nonzero_numerator = numerator != 0
valid_score = nonzero_denominator & nonzero_numerator

output_scores = np.ones([y_true.shape[1]])
output_scores[valid_score] = 1 - (numerator[valid_score] /
                                  denominator[valid_score])
# constant y_true: the score is set to 0.0 instead of dividing by zero
output_scores[nonzero_numerator & ~nonzero_denominator] = 0.0

return np.average(output_scores, weights=None)  # final score returned by the function
The problematic line in your case is the denominator calculation.
For the first case:
denominator = (weight * (y_true - np.average(
y_true, axis=0, weights=sample_weight)) ** 2).sum(axis=0,
dtype=np.float64)
print(denominator)
[ 8.07793567e-27]
It's pretty small, but not 0.
For the second case, it's exactly 0.
Since the denominator is 0, the ratio in r2_score is undefined and the function returns 0 instead. Hope this clears it up.
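You can verify the second case the same way; this is a sketch mirroring the lines above, where the denominator is expected to come out exactly 0:

import numpy as np

y_true = np.array([213.91666667] * 9).reshape(-1, 1)
denominator = ((y_true - np.average(y_true, axis=0)) ** 2).sum(axis=0, dtype=np.float64)
print(denominator)  # [0.] -- with nine elements the computed mean happens to come back exact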