I have defined a custom RMSE metric function:
def rmse(y_pred, y_true):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))
I was evaluating it against the mean squared error provided by Keras:
keras.losses.mean_squared_error(y_true, y_pred)
The values I get for the MSE and RMSE metrics on the same predictions are:
mse: 115.7218 - rmse: 8.0966
Now, when I take the square root of the MSE, I get 10.7574, which is obviously higher than the value the custom RMSE function outputs. I haven't been able to figure out why this is so, nor have I found any related posts on this particular topic. Is there maybe a mistake in the RMSE function that I'm simply not seeing? Or is it somehow related to how Keras defines axis=-1 in the MSE function (the purpose of which I haven't fully understood yet)?
Here is where I invoke the RMSE and MSE:
model.compile(loss="mae", optimizer="adam", metrics=["mse", rmse])
So I would expect the root of MSE to be the same as the RMSE.
I originally asked this question on Cross Validated but it was put on hold as off-topic.
RMSE is the square root of MSE. MSE is measured in units that are the square of the target variable, while RMSE is measured in the same units as the target variable. Due to its formulation, MSE, just like the squared loss function that it derives from, effectively penalizes larger errors more severely.
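For illustration (the numbers here are made up, not from the original post), a small NumPy example shows the units relationship between the two metrics and how squaring magnifies large errors:

```python
import numpy as np

# Hypothetical residuals, in the same units as the target variable
errors = np.array([1.0, 2.0, 10.0])

mse = np.mean(errors ** 2)   # squared units of the target
rmse = np.sqrt(mse)          # back in the target's own units

# Squaring lets the single large error dominate the score:
# the 10-unit error contributes 100 of the 105 total squared error.
print(mse)    # 35.0
print(rmse)   # ~5.916
```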
Is there maybe a mistake in the RMSE function that I'm simply not seeing? Or is it somehow related to how Keras defines axis=-1 in the MSE function (the purpose of which I haven't fully understood yet)?
When Keras computes the loss, the batch dimension is retained, which is the reason for axis=-1: the mean is taken over the last axis only, so the returned value is a tensor holding one loss per sample rather than a single scalar. This is because the loss for each sample may have to be weighted before the final mean is taken, depending on whether certain arguments, such as sample_weight, are passed to the fit() method.
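As a sketch of that behavior (using NumPy in place of the Keras backend, with made-up values), reducing over axis=-1 keeps the batch dimension, yielding one loss per sample that fit() can then weight:

```python
import numpy as np

y_true = np.array([[1.0, 2.0], [3.0, 4.0]])   # batch of 2 samples
y_pred = np.array([[1.5, 2.5], [2.0, 6.0]])

# Mimics Keras's mean_squared_error: reduce over the last axis only,
# keeping the batch dimension -> one loss value per sample.
per_sample_mse = np.mean((y_true - y_pred) ** 2, axis=-1)
print(per_sample_mse.shape)   # (2,)

# fit() can then apply per-sample weights before the final reduction:
sample_weight = np.array([1.0, 2.0])          # hypothetical weights
weighted_loss = np.average(per_sample_mse, weights=sample_weight)
```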
I get the same result with both approaches.
from tensorflow import keras
from tensorflow.keras import backend as K  # use the tf.keras backend, not the standalone keras package
import numpy as np

def rmse(y_pred, y_true):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

l1 = keras.layers.Input(shape=(32,))  # shape must be a tuple
l2 = keras.layers.Dense(10)(l1)
model = keras.Model(inputs=l1, outputs=l2)

train_examples = np.random.randn(5, 32)
train_labels = np.random.randn(5, 10)
MSE approach
model.compile(loss='mse', optimizer='adam')
model.evaluate(train_examples, train_labels)
RMSE approach
model.compile(loss=rmse, optimizer='adam')
model.evaluate(train_examples, train_labels)
Output
5/5 [==============================] - 0s 8ms/sample - loss: 1.9011
5/5 [==============================] - 0s 2ms/sample - loss: 1.3788
sqrt(1.9011) ≈ 1.3788, so on this single evaluation batch the RMSE is exactly the square root of the MSE.
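As for the discrepancy in the question (10.7574 vs 8.0966): a plausible explanation is that during training Keras reports each metric averaged over the batches of the epoch, and the average of per-batch RMSEs is not the square root of the average of per-batch MSEs. A NumPy sketch with hypothetical per-batch values:

```python
import numpy as np

# Hypothetical MSE values for three batches in one epoch
batch_mse = np.array([4.0, 100.0, 16.0])

epoch_mse = batch_mse.mean()             # 40.0 -> what the "mse" metric would show
epoch_rmse = np.sqrt(batch_mse).mean()   # (2 + 10 + 4) / 3 ~ 5.333 -> the "rmse" metric

# Taking sqrt of the averaged MSE gives a larger number than the
# averaged per-batch RMSE, matching the direction seen in the question.
print(np.sqrt(epoch_mse))   # ~6.325
print(epoch_rmse)           # ~5.333
```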