I'm using the same data but different Python libraries to calculate the coefficient of determination R^2. The stats library and sklearn yield different results.
What is the reason behind this behavior?
# Using stats linregress
from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
print(r_value**2)
0.956590054918
# Using sklearn
from sklearn.metrics import r2_score
print(r2_score(x, y))
0.603933484937
We can import r2_score from sklearn.metrics in Python to compute the R² score. The best possible score is 1, which is obtained when the predicted values are the same as the actual values. The R² score of a baseline model is 0. In the worst cases, the R² score can even be negative.
sklearn.metrics.r2_score: R^2 (coefficient of determination) regression score function. Best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
R² indicates the proportion of the variance in the observed values that is explained by the regression model. A higher value of R² generally indicates a better fit.
The r_value returned by linregress is the correlation coefficient r of x and y. In general, the squared correlation coefficient r² is not the same as the coefficient of determination R².
The coefficient of determination tells you how well a model fits the data. Thus, r2_score(x, y) treats its first argument, x, as the true values and its second argument, y, as the values predicted by a model.
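To make the difference concrete, here is a minimal sketch that computes both quantities from their textbook definitions in plain Python. The helper names pearson_r and r_squared and the data are made up for illustration; they are not the data from the question and not how either library implements the calculation.

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation coefficient r (symmetric in a and b)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

def r_squared(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical illustration data: y is exactly 2 * x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

print(pearson_r(x, y) ** 2)  # 1.0    (perfect linear correlation)
print(r_squared(x, y))       # -4.5   (y is a poor "prediction" of x)
print(r_squared(y, x))       # -0.375 (argument order matters for R^2)
```

The data are perfectly correlated, so r² is 1, yet R² is negative in both directions because neither array is a good prediction of the other's actual values. Swapping the arguments also changes R², while r² stays the same.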
If your x and y are true and predicted data, R² is what you want. However, if both are measured data you most likely want r² instead.
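If the goal is the R² of the fitted line itself, one option is to pass the measured y together with the line's predictions to r2_score. The sketch below assumes scipy, scikit-learn, and numpy are installed and uses made-up noisy data rather than the data from the question. For an ordinary least-squares line with an intercept, the two numbers should agree up to floating-point error.

```python
import numpy as np
from scipy import stats
from sklearn.metrics import r2_score

# Hypothetical noisy linear data for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=2.0, size=x.size)

# Fit a straight line and compute the model's predictions.
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
y_pred = slope * x + intercept

r_squared_corr = r_value ** 2          # squared correlation coefficient
r_squared_fit = r2_score(y, y_pred)    # true values first, predictions second
r_squared_raw = r2_score(x, y)         # the call from the question: x as "truth"

print(r_squared_corr, r_squared_fit, r_squared_raw)
```

Here r_squared_corr and r_squared_fit match, while r_squared_raw is a different quantity because it scores y as if it were a prediction of x.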
Details about the correlation coefficient and the coefficient of determination can be found on Wikipedia.