
Keras MSE definition

I stumbled across the definition of mse in Keras and I can't seem to find an explanation.

def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

I was expecting the mean to be taken across the batches, which is axis=0, but instead, it is axis=-1.

I also played around with it a little to see if K.mean actually behaves like numpy.mean. I must have misunderstood something. Can somebody please clarify?

I can't actually look inside the cost function at run time, right? As far as I know, the function is called at compile time, which prevents me from evaluating concrete values.

I mean... imagine doing regression and having a single output neuron and training with a batch size of ten.

>>> import numpy as np
>>> a = np.ones((10, 1))
>>> a
array([[ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.]])
>>> np.mean(a, axis=-1)
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

All it does is flatten the array instead of taking the mean of all the predictions.
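To make the contrast explicit, here is a small numpy sketch comparing the two axes on the same array:

```python
import numpy as np

a = np.ones((10, 1))               # batch of 10 predictions, one output each

per_sample = np.mean(a, axis=-1)   # mean over the last axis -> shape (10,)
per_output = np.mean(a, axis=0)    # mean over the batch     -> shape (1,)
overall = np.mean(a)               # mean over everything    -> scalar

print(per_sample.shape, per_output.shape, overall)
```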

Nima Mousavi asked Feb 05 '18



2 Answers

K.mean(a, axis=-1) and np.mean(a, axis=-1) both take the mean across the final dimension. Here a is an array with shape (10, 1), so in this case taking the mean across the final dimension happens to be the same as flattening it to a 1-d array of shape (10,). Implementing it like this supports the more general case of, e.g., multiple-output regression.
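To illustrate that more general case, here is a numpy sketch with several outputs per sample (the shapes and values are made up for illustration):

```python
import numpy as np

# Hypothetical batch: 4 samples, 3 regression outputs each
y_true = np.array([[1., 2., 3.],
                   [0., 0., 0.],
                   [1., 1., 1.],
                   [2., 2., 2.]])
y_pred = y_true + 0.5  # every prediction is off by 0.5

# axis=-1 averages over the outputs, yielding one MSE per sample
per_sample_mse = np.mean(np.square(y_pred - y_true), axis=-1)
print(per_sample_mse)  # shape (4,), each entry 0.5**2 = 0.25
```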

Also, you can inspect the value of nodes in the computation graph at run-time using keras.backend.print_tensor. See answer: Is there any way to debug a value inside a tensor while training on Keras?

Edit: Your question appears to be about why the loss doesn't return a single scalar value but instead returns a scalar value for each data point in the batch. To support sample weighting, Keras losses are expected to return a scalar for each data point in the batch. See the losses documentation and the sample_weight argument of fit for more information. Note specifically: "The actual optimized objective is the [weighted] mean of the output array across all data points."
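With per-sample losses, Keras can apply sample weights before reducing to a single scalar. A numpy sketch of that final reduction (the loss values and weights here are made up):

```python
import numpy as np

per_sample_loss = np.array([0.2, 0.4, 0.1, 0.3])  # one loss per data point
sample_weight = np.array([1.0, 2.0, 1.0, 0.0])    # hypothetical weights

# Weighted mean across all data points -- the actual optimized objective
objective = np.sum(per_sample_loss * sample_weight) / np.sum(sample_weight)
print(objective)  # (0.2 + 0.8 + 0.1 + 0.0) / 4.0 = 0.275
```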

tiao answered Sep 23 '22


The code is as follows:

def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

One application where axis=-1 is the natural choice is colored images: an RGB picture has 3 channels, so a 512 × 512 image is stored in an array of shape 512 × 512 × 3.

Suppose your task is to reconstruct the picture, and you store the result in another array of shape 512 × 512 × 3.

Calling this MSE lets you analyze how good your reconstruction is at each pixel. The output has shape 512 × 512, summarizing your performance per pixel.
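As a sketch of this (using small 4 × 4 images instead of 512 × 512, to keep it cheap):

```python
import numpy as np

rng = np.random.default_rng(0)
original = rng.random((4, 4, 3))        # H x W x RGB
reconstruction = rng.random((4, 4, 3))  # hypothetical reconstruction

# axis=-1 averages over the 3 color channels -> one error per pixel
per_pixel_mse = np.mean(np.square(reconstruction - original), axis=-1)
print(per_pixel_mse.shape)  # (4, 4)
```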

Siong Thye Goh answered Sep 24 '22