
Keras MSE definition

I stumbled across the definition of mse in Keras and I can't seem to find an explanation.

def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

I was expecting the mean to be taken across the batches, which is axis=0, but instead, it is axis=-1.

I also played around with it a little to see if K.mean actually behaves like numpy.mean. I must have misunderstood something. Can somebody please clarify?

I can't actually look inside the cost function at run time, right? As far as I know, the function is called at compile time, which prevents me from evaluating concrete values.

I mean... imagine doing regression and having a single output neuron and training with a batch size of ten.

>>> import numpy as np
>>> a = np.ones((10, 1))
>>> a
array([[ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.]])
>>> np.mean(a, axis=-1)
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

All it does is flatten the array instead of taking the mean of all the predictions.
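To make the contrast explicit, here is a small numpy sketch comparing the two axes on the same array:

```python
import numpy as np

a = np.ones((10, 1))               # batch of 10 predictions, one output each

per_sample = np.mean(a, axis=-1)   # mean over the last axis -> shape (10,)
per_output = np.mean(a, axis=0)    # mean over the batch     -> shape (1,)
overall = np.mean(a)               # mean over everything    -> scalar

print(per_sample.shape, per_output.shape, overall)
```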

Nima Mousavi asked Feb 05 '18



2 Answers

K.mean(a, axis=-1) and np.mean(a, axis=-1) both take the mean across the final dimension. Here a is an array with shape (10, 1), so in this case taking the mean across the final dimension happens to be the same as flattening it to a 1-d array of shape (10,). Implementing it like this supports the more general case of, e.g., multiple-output regression.
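To illustrate that more general case, here is a numpy sketch with several outputs per sample (the shapes and values are made up for illustration):

```python
import numpy as np

# Hypothetical batch: 4 samples, 3 regression outputs each
y_true = np.array([[1., 2., 3.],
                   [0., 0., 0.],
                   [1., 1., 1.],
                   [2., 2., 2.]])
y_pred = y_true + 0.5  # every prediction is off by 0.5

# axis=-1 averages over the outputs, yielding one MSE per sample
per_sample_mse = np.mean(np.square(y_pred - y_true), axis=-1)
print(per_sample_mse)  # shape (4,), each entry 0.5**2 = 0.25
```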

Also, you can inspect the value of nodes in the computation graph at run-time using keras.backend.print_tensor. See answer: Is there any way to debug a value inside a tensor while training on Keras?

Edit: Your question appears to be about why the loss doesn't return a single scalar value but instead returns a scalar value for each data point in the batch. To support sample weighting, Keras losses are expected to return a scalar for each data point in the batch. See the losses documentation and the sample_weight argument of fit for more information. Note specifically: "The actual optimized objective is the [weighted] mean of the output array across all data points."
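With per-sample losses, Keras can apply sample weights before reducing to a single scalar. A numpy sketch of that final reduction (the loss values and weights here are made up):

```python
import numpy as np

per_sample_loss = np.array([0.2, 0.4, 0.1, 0.3])  # one loss per data point
sample_weight = np.array([1.0, 2.0, 1.0, 0.0])    # hypothetical weights

# Weighted mean across all data points -- the actual optimized objective
objective = np.sum(per_sample_loss * sample_weight) / np.sum(sample_weight)
print(objective)  # (0.2 + 0.8 + 0.1 + 0.0) / 4.0 = 0.275
```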

tiao answered Sep 23 '22


The code is as follows:

def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

One application where axis=-1 is the natural choice is colored images: an RGB picture has 3 channels, so a 512 × 512 image is stored in an array of shape 512 × 512 × 3.

Suppose your task is to reconstruct the picture, and you store the result in another array of shape 512 × 512 × 3.

Calling this MSE lets you analyze how good your reconstruction is at each pixel. The output has shape 512 × 512, summarizing your performance per pixel.
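As a sketch of this (using small 4 × 4 images instead of 512 × 512, to keep it cheap):

```python
import numpy as np

rng = np.random.default_rng(0)
original = rng.random((4, 4, 3))        # H x W x RGB
reconstruction = rng.random((4, 4, 3))  # hypothetical reconstruction

# axis=-1 averages over the 3 color channels -> one error per pixel
per_pixel_mse = np.mean(np.square(reconstruction - original), axis=-1)
print(per_pixel_mse.shape)  # (4, 4)
```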

Siong Thye Goh answered Sep 24 '22