I want to do sparse, high-dimensional (a few thousand features) least squares regression with a few hundred thousand examples. I'm happy to use non-fancy optimisation - stochastic gradient descent is fine.
Does anyone know of any software already implemented for doing this, so I don't have to write my own?
Kind regards.
The least squares method finds the best-fitting curve (or line of best fit) for a set of data points by minimizing the sum of the squared offsets (residuals) of the points from the curve. Least squares regression is used to predict the behavior of a dependent variable from one or more explanatory variables.
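As a minimal illustration of "minimizing the sum of squared residuals", here is a sketch in Python with NumPy (my choice of library, not something specified in the thread), fitting a line to four points that lie exactly on y = 2x + 1:

```python
import numpy as np

# Fit y ≈ a*x + b by minimizing the sum of squared residuals.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])  # exactly y = 2x + 1

# Design matrix: one column for the slope, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# np.linalg.lstsq returns the coefficients that minimize ||A @ coef - y||^2.
coef, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
a, b = coef
print(a, b)  # slope ≈ 2, intercept ≈ 1
```

Because the points are exactly collinear here, the minimized sum of squared residuals is (numerically) zero; with noisy data it would be the smallest achievable value instead.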
While I don't know for sure, this strikes me as the kind of thing that LAPACK (Linear Algebra PACKage) would be able to provide support for. It is aimed at large matrix computations, including sparse matrices and out-of-core problem sizes. The reference implementation is in Fortran, but there are ports of the library for C and other languages.
As LAPACK uses BLAS (Basic Linear Algebra Subprograms) for many of its underlying calls, you will probably also want to check out Sparse BLAS.
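If lower-level Fortran/C libraries feel heavyweight, one ready-made option (my suggestion, not mentioned above) is SciPy's sparse iterative solver `scipy.sparse.linalg.lsqr`, which minimizes the least squares objective while touching only the non-zero entries. A sketch with placeholder problem sizes:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr

# Toy sparse design matrix: sizes here are placeholders, far smaller than the
# "few hundred thousand examples x few thousand features" in the question.
n_samples, n_features = 5000, 500
X = sparse.random(n_samples, n_features, density=0.01,
                  random_state=0, format="csr")

# Ground-truth weights with only a few non-zeros, and a consistent target.
true_w = np.zeros(n_features)
true_w[:10] = 1.0
y = X @ true_w

# lsqr iteratively solves min ||X w - y||_2 using only sparse mat-vec products.
w = lsqr(X, y, atol=1e-10, btol=1e-10)[0]
print(np.abs(X @ w - y).max())  # residual should be tiny for this consistent system
```

Because `lsqr` only needs matrix-vector products, memory scales with the number of non-zeros rather than with n_samples x n_features, which is what makes it practical at the scale described in the question.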
I'm pretty sure that R can be used for problems like this. It's incredibly powerful and flexible, and there are lots of online resources linked from its project page.