Gradient Descent Linear Regression in Java

This is a bit of a long shot, but I wonder if someone could look at this. Am I doing batch gradient descent for linear regression correctly here? It gives the expected answers for a single independent variable plus an intercept, but not for multiple independent variables.

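import cern.colt.matrix.DoubleMatrix1D;
import cern.colt.matrix.DoubleMatrix2D;
import cern.colt.matrix.linalg.Algebra;
import cern.jet.math.Functions;

/**
 * One step of batch gradient descent (using the Colt matrix library).
 * @param alpha Learning rate
 * @param thetas Current thetas (modified in place)
 * @param independent Independent variables, M examples x N features
 * @param dependent Dependent values, one per example
 * @return the updated thetas
 */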
public DoubleMatrix1D descent(double         alpha,
                              DoubleMatrix1D thetas,
                              DoubleMatrix2D independent,
                              DoubleMatrix1D dependent ) {
    Algebra algebra     = new Algebra();

    // alpha * (1/m) folded into a single factor.
    double  modifier    = alpha / (double)independent.rows();

    // I think this can just skip the transpose of theta.
    // This is the result of running every example Xi through the hypothesis function:
    // each feature Xj is multiplied by its theta and the products are summed.
    DoubleMatrix1D hypothesies = algebra.mult( independent, thetas );

    // hypothesis - Y
    // Now we have, for each Xi, the difference between the value predicted by the hypothesis and the actual Yi.
    hypothesies.assign(dependent, Functions.minus);

    // Transpose the examples (MxN) to NxM so we can multiply by the Mx1 vector of differences.
    DoubleMatrix2D transposed = algebra.transpose(independent);

    DoubleMatrix1D deltas     = algebra.mult(transposed, hypothesies );


    // Scale the deltas by 1/m and the learning rate alpha (alpha/m).
    deltas.assign(Functions.mult(modifier));

    //Theta = Theta - Deltas
    thetas.assign( deltas, Functions.minus );

    return( thetas );
}
Jeremy asked Jul 01 '26 01:07


1 Answer

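There is nothing wrong with your implementation. Based on your comment, the problem is the collinearity you introduce when generating x2. This is problematic in regression estimation: when one column is a linear combination of the others, many different sets of coefficients fit the data equally well, so gradient descent has no single "expected" answer to converge to.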
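To make the collinearity point concrete, here is a minimal sketch in plain Java (no Colt). It assumes, purely for illustration, that x2 is generated as exactly 2 * x1; the class and method names are hypothetical. Under that assumption, two different weight vectors produce identical predictions on every row, so no estimator can tell them apart.

```java
class CollinearityDemo {

    /** Prediction for one row: w0 + w1 * x1 + w2 * x2. */
    static double predict(double[] w, double x1, double x2) {
        return w[0] + w[1] * x1 + w[2] * x2;
    }

    /** Largest absolute gap between the predictions of a and b when x2 = 2 * x1. */
    static double maxGap(double[] a, double[] b) {
        double gap = 0.0;
        for (int i = 0; i <= 10; i++) {
            double x1 = i;
            double x2 = 2.0 * x1; // assumption: x2 is an exact multiple of x1
            gap = Math.max(gap, Math.abs(predict(a, x1, x2) - predict(b, x1, x2)));
        }
        return gap;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 3.0, 0.0}; // y = 1 + 3 * x1
        double[] b = {1.0, 1.0, 1.0}; // y = 1 + x1 + (2 * x1), the same function
        System.out.println(maxGap(a, b));
    }
}
```

Both weight vectors describe the same fit, so whichever one gradient descent lands on depends on the starting thetas rather than on the data.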

To test your algorithm, generate two independent columns of random numbers for x1 and x2. Pick values for w0, w1 and w2, i.e. the coefficients for the intercept, x1 and x2 respectively, and calculate the dependent value y from them.

Then see whether your stochastic/batch gradient descent algorithm can recover the values of w0, w1 and w2.
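The test described above could be sketched roughly as follows. This is not your Colt code: it uses plain arrays so it runs without dependencies, and the class name, the chosen weights (2.0, -1.0, 0.5), the learning rate, the iteration count and the seed are all arbitrary assumptions. The update is the same rule as in the question, theta := theta - (alpha/m) * X^T (X * theta - y).

```java
import java.util.Random;

class GradientDescentCheck {

    /** One batch update: theta - (alpha/m) * X^T (X * theta - y). Returns a new array. */
    static double[] descent(double alpha, double[] theta, double[][] x, double[] y) {
        int m = x.length;
        int n = theta.length;
        double[] gradient = new double[n];
        for (int i = 0; i < m; i++) {
            double error = -y[i]; // becomes (hypothesis - y) for example i
            for (int j = 0; j < n; j++) error += x[i][j] * theta[j];
            for (int j = 0; j < n; j++) gradient[j] += error * x[i][j];
        }
        double[] next = new double[n];
        for (int j = 0; j < n; j++) next[j] = theta[j] - alpha / m * gradient[j];
        return next;
    }

    /** Builds noise-free data from known weights, then fits it starting from zeros. */
    static double[] recover(double[] trueW, int m, int iterations, double alpha, long seed) {
        Random rng = new Random(seed);
        double[][] x = new double[m][3];
        double[] y = new double[m];
        for (int i = 0; i < m; i++) {
            x[i][0] = 1.0;              // intercept column
            x[i][1] = rng.nextDouble(); // x1
            x[i][2] = rng.nextDouble(); // x2, drawn independently of x1
            y[i] = trueW[0] + trueW[1] * x[i][1] + trueW[2] * x[i][2];
        }
        double[] theta = new double[3];
        for (int k = 0; k < iterations; k++) theta = descent(alpha, theta, x, y);
        return theta;
    }

    public static void main(String[] args) {
        double[] w = recover(new double[] {2.0, -1.0, 0.5}, 200, 5000, 0.5, 42L);
        System.out.printf("w0=%.4f w1=%.4f w2=%.4f%n", w[0], w[1], w[2]);
    }
}
```

If the printed values land very close to the weights you picked, the update rule is fine and the earlier mismatch most likely comes from how x2 was generated.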

iTech answered Jul 03 '26 14:07