 

TensorFlow: implementing Mean Squared Error

I'm currently learning TensorFlow and came across this notebook.

I have a question about how the mean squared error cost function is implemented:

import tensorflow as tf 
import numpy as np 

predicted = np.array([1,2,3])
Y = np.array([4,5,6])
num_instances = predicted.shape[0]

cost = tf.reduce_sum(tf.pow(predicted-Y, 2))/(2*num_instances)
cost2 = tf.reduce_mean(tf.square(predicted - Y))

with tf.Session() as sess:
  print(sess.run(cost))
  print(sess.run(cost2))

I don't get why the denominator of the first cost function is multiplied by 2. The two implementations give different answers: cost yields 4.5 while cost2 yields 9. Following the formula for mean squared error, I should get 9, but the first cost function is the one implemented in the notebook I'm trying to learn from.
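The two numbers can be checked directly. A quick NumPy sketch of the same arithmetic (no TensorFlow session needed):

```python
import numpy as np

predicted = np.array([1, 2, 3])
Y = np.array([4, 5, 6])
n = predicted.shape[0]

# Sum of squared errors: (-3)^2 + (-3)^2 + (-3)^2 = 27
sq_err = np.sum((predicted - Y) ** 2)

cost = sq_err / (2 * n)   # 27 / 6 = 4.5  (the notebook's "half MSE")
cost2 = sq_err / n        # 27 / 3 = 9.0  (the textbook MSE)
print(cost, cost2)        # 4.5 9.0
```

So the two formulas differ only by the constant factor 2 in the denominator, which is exactly the 4.5 vs. 9 discrepancy.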

Asked by ZZZZZZZZZ on Mar 31 '26


1 Answer

The difference between cost and cost2 is exactly the factor 2 in 2*num_instances. Basically,

cost = tf.reduce_sum(tf.pow(predicted-Y, 2))/(2*num_instances)
cost2 = tf.reduce_sum(tf.pow(predicted-Y, 2))/(num_instances)

The scalar 2 doesn't really affect learning: dividing the loss by 2 scales every gradient by 1/2, so it is equivalent to halving the learning rate. (The 2 in the denominator is a common convention because it cancels the 2 produced when differentiating the squared term.) Note that whatever formula and network topology you use, you still need to select reasonable hyperparameters, including the learning rate.
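The learning-rate equivalence is easy to verify by hand. A minimal sketch, assuming a toy linear model y = w*x and writing the two gradients explicitly: one gradient-descent step on the half-MSE with rate lr produces exactly the same update as a step on the full MSE with rate lr/2.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
n = len(x)

def grad_half_mse(w):
    # d/dw of sum((w*x - y)^2) / (2n): the 2 from differentiation cancels
    return np.sum((w * x - y) * x) / n

def grad_mse(w):
    # d/dw of sum((w*x - y)^2) / n: the 2 survives
    return 2 * np.sum((w * x - y) * x) / n

w0, lr = 0.0, 0.1
step_a = w0 - lr * grad_half_mse(w0)       # half MSE, learning rate lr
step_b = w0 - (lr / 2) * grad_mse(w0)      # full MSE, learning rate lr/2
print(step_a, step_b)                      # identical updates
```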

You can inspect the convergence of both loss functions; I suspect they perform the same. That means both formulas are fine, and the second one is just easier to implement.
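To see that both formulas end up in the same place, here is a hypothetical toy experiment (plain NumPy gradient descent on an assumed linear model y = w*x, not the notebook's setup): minimizing either cost drives w to the same optimum, just at different speeds.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])   # generated by w = 2, so the optimum is w = 2
n = len(x)

def descend(grad, w=0.0, lr=0.05, steps=500):
    # Plain gradient descent on a scalar parameter
    for _ in range(steps):
        w -= lr * grad(w)
    return w

w_half = descend(lambda w: np.sum((w * x - y) * x) / n)      # half-MSE gradient
w_full = descend(lambda w: 2 * np.sum((w * x - y) * x) / n)  # full-MSE gradient
print(w_half, w_full)  # both converge to 2.0
```

The full-MSE run converges faster at the same learning rate (its gradients are twice as large), but the minimizer is identical.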

Answered by Maxim on Apr 02 '26


