I really can't understand the following equation, especially the 1/(2m) factor.
What's the purpose of this equation? And where does 1/(2m) come from?
J(theta_0, theta_1) = 1/(2m) * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2
Please explain how it is derived.
For a linear regression model, the cost function is built from the errors obtained by subtracting the actual values from the predicted values, usually measured as the (root) mean squared error. Training looks for the parameters that minimize this cost.
The Cost Function of Linear Regression: A cost function measures how well a machine learning model performs. It is the error between predicted values and actual values, expressed as a single real number.
The hypothesis of logistic regression limits its output to values between 0 and 1. A linear function cannot represent it, because a linear function can take values greater than 1 or less than 0, which the logistic regression hypothesis does not allow.
In practical terms, linear regression is useful even if you are also using a more complex model for your work. Linear regression is easy to understand, so it gives you a simple reference point for reasoning about what is happening in more complex models.
The cost function is
J(theta_0, theta_1) = 1/(2m) * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2
By h_theta(x^i) we denote what the model outputs for x^i, so h_theta(x^i) - y^i is its error (assuming that y^i is the correct output).
Now we calculate the square of this error, [ h_theta(x^i) - y^i ]^2 (which removes the sign, as the error can be either positive or negative), and sum it over all samples. So that the result does not simply grow with the number of samples, we normalize it by dividing by m. This gives the mean (because we divide by the number of samples) squared (because we square) error (because we compute an error):
1/m * sum_(i=1)^m [ h_theta(x^i) - y^i ]^2
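As a rough illustration, the cost J can be computed directly from this definition. The sketch below assumes the usual single-variable hypothesis h_theta(x) = theta_0 + theta_1 * x (the question does not spell it out), and the data points are made up for the example.

```python
def cost(theta0, theta1, xs, ys):
    """Halved mean squared error J(theta_0, theta_1).

    Assumes the hypothesis h_theta(x) = theta0 + theta1 * x.
    """
    m = len(xs)
    # Squared error of the model on each sample.
    squared_errors = [(theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)]
    # Sum over all samples, then divide by 2m.
    return sum(squared_errors) / (2 * m)


# Hypothetical toy data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

perfect = cost(0.0, 2.0, xs, ys)  # every error is zero
worse = cost(0.0, 1.0, xs, ys)    # errors are -1, -2, -3

print(perfect, worse)
```

Here the second call sums the squared errors 1 + 4 + 9 = 14 and divides by 2m = 6, so a worse fit gives a larger cost, while the perfect fit gives a cost of zero.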
The 2 in the denominator is there only to simplify the derivative. When you minimize the cost, you will use the steepest descent method, which is based on the derivative of this function. The derivative of a^2 is 2a, and our function is a square of something, so the 2 cancels out. Dividing by a positive constant does not change where the minimum is, so this is the only reason for its existence.
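To see the cancellation concretely: differentiating with respect to theta_1 gives 1/(2m) * sum_(i=1)^m 2 * [ h_theta(x^i) - y^i ] * x^i, which simplifies to 1/m * sum_(i=1)^m [ h_theta(x^i) - y^i ] * x^i. The sketch below again assumes h_theta(x) = theta_0 + theta_1 * x and uses made-up data. It checks the factor-free gradient against a finite-difference estimate and then runs a few steps of steepest descent. The learning rate alpha and the iteration count are arbitrary choices for this toy example.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta_0, theta_1) with the 1/(2m) factor."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient(theta0, theta1, xs, ys):
    """Analytic partial derivatives of J. The 2 has cancelled, so only 1/m remains."""
    m = len(xs)
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    d_theta0 = sum(errors) / m
    d_theta1 = sum(e * x for e, x in zip(errors, xs)) / m
    return d_theta0, d_theta1


def numeric_gradient(theta0, theta1, xs, ys, eps=1e-6):
    """Central finite differences, used only to check the analytic formula."""
    d0 = (cost(theta0 + eps, theta1, xs, ys) - cost(theta0 - eps, theta1, xs, ys)) / (2 * eps)
    d1 = (cost(theta0, theta1 + eps, xs, ys) - cost(theta0, theta1 - eps, xs, ys)) / (2 * eps)
    return d0, d1


# Hypothetical toy data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

analytic = gradient(0.0, 1.0, xs, ys)
numeric = numeric_gradient(0.0, 1.0, xs, ys)

# Steepest descent: repeatedly step against the gradient.
theta0, theta1 = 0.0, 0.0
alpha = 0.1  # assumed learning rate, small enough for this data
for _ in range(5000):
    g0, g1 = gradient(theta0, theta1, xs, ys)
    theta0 -= alpha * g0
    theta1 -= alpha * g1

print(analytic, numeric, theta0, theta1)
```

Without the 1/2 in the cost, every gradient would simply be twice as large, which could be absorbed into the learning rate. The minimum itself stays at the same parameters.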