I'm building a small neural net in Keras meant for a regression task, and I want to use the same accuracy metric as the scikit-learn RandomForestRegressor:
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum().
It's a handy metric because it shows values up to 1.0 (similar to percent accuracy in classification). Is my usage of Keras backend correct for the accuracy metric I want?
    from keras.models import Sequential
    from keras.layers import Dense
    from keras import backend as K

    def create_model():
        model = Sequential()
        model.add(Dense(10,
                        input_dim=X.shape[1],  # X is the training feature matrix, defined elsewhere
                        activation="relu"))
        model.add(Dense(10,
                        activation="relu"))
        model.add(Dense(1))
        # Compile model with the custom R^2 metric
        model.compile(loss="mean_squared_error", optimizer="adam", metrics=[det_coeff])
        return model

    # Is this computing the right thing?
    def det_coeff(y_true, y_pred):
        u = K.sum(K.square(y_true - y_pred))            # residual sum of squares
        v = K.sum(K.square(y_true - K.mean(y_true)))    # total sum of squares
        return K.ones_like(v) - (u / v)
This appears to work, in that nothing raises an error and the metric increases towards 1 over time, but I want to make sure I implemented the metric correctly. I'm new to Keras backend functions.
The R2 score, also called the coefficient of determination, is a standard goodness-of-fit measure for regression models. It is the proportion of the variance in the dependent variable that is predictable from the independent variables: 1 minus the ratio of the residual sum of squares to the total sum of squares. A score of 1.0 means the predictions explain all of the variance in the targets, while a model that always predicts the mean of y_true scores 0.
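To make the formula concrete, here is the same computation in plain NumPy (a minimal sketch; the function name r2_numpy and the sample values are only for illustration):

    import numpy as np

    def r2_numpy(y_true, y_pred):
        # Residual sum of squares: squared distance between predictions and targets
        ss_res = np.sum((y_true - y_pred) ** 2)
        # Total sum of squares: squared distance between targets and their mean
        ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
        return 1.0 - ss_res / ss_tot

    y_true = np.array([3.0, -0.5, 2.0, 7.0])
    y_pred = np.array([2.5, 0.0, 2.0, 8.0])
    print(r2_numpy(y_true, y_pred))  # ~0.9486, matches sklearn.metrics.r2_score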
You can check this post out. I tested the following code and it works for your purpose.

    from keras import backend as K

    def coeff_determination(y_true, y_pred):
        SS_res = K.sum(K.square(y_true - y_pred))            # residual sum of squares
        SS_tot = K.sum(K.square(y_true - K.mean(y_true)))    # total sum of squares
        # K.epsilon() keeps the division numerically stable if a batch has zero variance
        return 1 - SS_res / (SS_tot + K.epsilon())
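One way to double-check the implementation (a quick sketch, assuming a TensorFlow-backed Keras and scikit-learn installed) is to evaluate the metric on constant tensors and compare it with sklearn.metrics.r2_score on the same values:

    import numpy as np
    from keras import backend as K
    from sklearn.metrics import r2_score

    y_true = np.array([[3.0], [-0.5], [2.0], [7.0]], dtype="float32")
    y_pred = np.array([[2.5], [0.0], [2.0], [8.0]], dtype="float32")

    keras_r2 = K.eval(coeff_determination(K.constant(y_true), K.constant(y_pred)))
    sklearn_r2 = r2_score(y_true, y_pred)
    print(keras_r2, sklearn_r2)  # both should be close to 0.9486

Keep in mind that during training Keras computes the metric per batch and averages those values over the epoch, so the number in the training log can differ slightly from R^2 computed over the full dataset; the K.epsilon() term also shifts the result by a negligible amount.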