How to handle memory error while fitting GaussianMixture in sklearn python?

I am trying to fit a GaussianMixture with sklearn to a set of cat and dog pictures. I pass a numpy array of shape (50, 30000), where 50 is the number of data points (25 cat and 25 dog pictures) and 30000 is the number of features after I resize each picture to (100, 100, 3) and convert it to a numpy array. The fit throws a MemoryError. I have 4 GB of RAM, and about 70% of it is already in use before I run this code. Can anyone suggest how to find out how much memory the GaussianMixture fit method uses in sklearn? Or can anyone provide some code to fit it in batches?

The code is as follows:

print(img_coll_cat_dog.shape)
print(img_coll_cat_dog.nbytes)
print(img_coll_cat_dog.itemsize)

Result:

(50, 30000)
12000000 bytes
8 

gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
gmix.fit(img_coll_cat_dog)

This is the error I am getting:

MemoryError                               Traceback (most recent call last)
<ipython-input-32-c0370476a619> in <module>()
      1 gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
----> 2 gmix.fit(img_coll_cat_dog)

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in fit(self, X, y)
    205 
    206             if do_init:
--> 207                 self._initialize_parameters(X, random_state)
    208                 self.lower_bound_ = -np.infty
    209 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in _initialize_parameters(self, X, random_state)
    155                              % self.init_params)
    156 
--> 157         self._initialize(X, resp)
    158 
    159     @abstractmethod

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _initialize(self, X, resp)
    629 
    630         weights, means, covariances = _estimate_gaussian_parameters(
--> 631             X, resp, self.reg_covar, self.covariance_type)
    632         weights /= n_samples
    633 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_parameters(X, resp, reg_covar, covariance_type)
    283                    "diag": _estimate_gaussian_covariances_diag,
    284                    "spherical": _estimate_gaussian_covariances_spherical
--> 285                    }[covariance_type](resp, X, nk, means, reg_covar)
    286     return nk, means, covariances
    287 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_covariances_full(resp, X, nk, means, reg_covar)
    162     """
    163     n_components, n_features = means.shape
--> 164     covariances = np.empty((n_components, n_features, n_features))
    165     for k in range(n_components):
    166         diff = X - means[k]

MemoryError: 

Any help is much appreciated.

Surjya Narayana Padhi asked Sep 25 '17 06:09


1 Answer

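Try setting covariance_type='diag'. With covariance_type='full', the fit allocates a covariance array of shape (n_components, n_features, n_features), which is the np.empty call at the bottom of your traceback. For 2 components and 30000 features stored as 8-byte floats, that is 2 * 30000 * 30000 * 8 bytes, about 14.4 GB, far more than your 4 GB of RAM. With 'diag', the array has shape (n_components, n_features), which is about 480 KB.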
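As a rough way to check the memory requirement before fitting, the sketch below estimates the size of the covariance array alone for each covariance_type and then fits with 'diag'. The helper covariance_bytes is a hypothetical name introduced here, not part of sklearn, and the random array X only stands in for img_coll_cat_dog. The estimate is a lower bound: the fit also holds precision arrays and temporaries, so real usage will be higher.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def covariance_bytes(n_components, n_features, covariance_type, itemsize=8):
    """Estimate the bytes needed for the covariance array alone.

    Hypothetical helper (not an sklearn function). The shapes follow the
    documented shapes of GaussianMixture.covariances_.
    """
    shapes = {
        "full": (n_components, n_features, n_features),
        "tied": (n_features, n_features),
        "diag": (n_components, n_features),
        "spherical": (n_components,),
    }
    n_elements = int(np.prod(shapes[covariance_type], dtype=np.int64))
    return n_elements * itemsize


n_components, n_features = 2, 30000
full_bytes = covariance_bytes(n_components, n_features, "full")
diag_bytes = covariance_bytes(n_components, n_features, "diag")
print("full: %.1f GB" % (full_bytes / 1e9))
print("diag: %.1f KB" % (diag_bytes / 1e3))

# Stand-in data with the same shape and dtype as in the question
# (assumption: the real images would be loaded here instead).
rng = np.random.RandomState(0)
X = rng.rand(50, n_features)

# 'diag' keeps one variance per feature per component, so it fits in memory.
gmix = GaussianMixture(n_components=n_components, covariance_type="diag",
                       random_state=0)
gmix.fit(X)
labels = gmix.predict(X)
print(gmix.covariances_.shape)
```

Note that 'diag' assumes the features are uncorrelated within each component, which is a strong simplification for raw pixels. With only 50 samples there is nowhere near enough data to estimate a full 30000 x 30000 covariance anyway, so the trade-off is probably acceptable here.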

David Lee answered Sep 30 '22 14:09