I have a matrix `X` with about 7000 columns and 38000 rows, i.e. a numpy array of shape `(38000, 7000)`.
I instantiated the model

```python
model = RidgeCV(alphas=(0.001, 0.01, 0.1, 1))
```

and then fitted it:

```python
model.fit(X, y)
```
where `y` is the response vector, a numpy array of shape `(38000,)`.
Running this raises a `MemoryError`.
How can I solve this?
My Idea
My first thought was to split the matrix `X` "horizontally". By this I mean dividing `X` into, say, two matrices with the same number of columns (thus keeping all the features) but fewer rows, then fitting the model on each of these submatrices in turn. But I am afraid this is not really equivalent to fitting on the whole matrix...
Any ideas?
This is a well-known issue that can be addressed with out-of-core learning. Googling the term will turn up several ways to tackle the problem.
For your specific problem, first create a generator that yields a row (or several rows) of your matrix, then use the `partial_fit` method of your algorithm.
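As a minimal sketch of such a generator, assuming `X` and `y` are stored on disk as `.npy` files and memory-mapped so that only the current slice is loaded (the file names and batch size are illustrative):

```python
import numpy as np

# Memory-map the arrays so slices are read from disk on demand
X = np.load('X.npy', mmap_mode='r')
y = np.load('y.npy', mmap_mode='r')

def batch_generator(X, y, batch_size=1000):
    """Yield successive (X_batch, y_batch) mini-batches of rows."""
    n_samples = X.shape[0]
    for start in range(0, n_samples, batch_size):
        stop = start + batch_size
        # np.asarray copies only this slice into RAM
        yield np.asarray(X[start:stop]), np.asarray(y[start:stop])
```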
Standard scikit-learn estimators such as `sklearn.linear_model.LinearRegression` or `sklearn.linear_model.RidgeCV` actually compute an exact solution. Other methods are based on mini-batch learning and expose a `partial_fit` method, like `sklearn.linear_model.SGDRegressor`, which allows fitting on one mini-batch at a time. That is what you are looking for.
The process is: use the generator to yield a mini-batch, apply the `partial_fit` method, release the mini-batch from memory, and fetch a new one.
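Putting the pieces together, a sketch of the training loop using the `batch_generator` above (the number of epochs and the `alpha` value are illustrative; the `l2` penalty makes `SGDRegressor` behave like ridge regression):

```python
from sklearn.linear_model import SGDRegressor

# The l2 penalty corresponds to ridge regularization
model = SGDRegressor(penalty='l2', alpha=0.01)

# Several passes (epochs) over the data help SGD converge
for epoch in range(5):
    for X_batch, y_batch in batch_generator(X, y, batch_size=1000):
        model.partial_fit(X_batch, y_batch)
```

Note that SGD is sensitive to feature scaling, so standardizing the columns beforehand (e.g. with `sklearn.preprocessing.StandardScaler`, which also has a `partial_fit` method) usually helps.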
However, be aware that this method is stochastic: the result depends on the order of your data and on the initialization of the weights, in contrast to the exact solution given by the standard regression methods when all the data fits in memory. I won't go into the details, but have a look at gradient descent optimization to understand how it works (http://ruder.io/optimizing-gradient-descent/).