
Numpy memory error creating huge matrix

I am using NumPy and trying to create a huge matrix. While doing this, I get a memory error.

Because the matrix itself is not important, I will just show an easy way to reproduce the error.

import numpy as np
a = 10000000000
data = np.array([float('nan')] * a)

Not surprisingly, this raises a MemoryError.

There are two things I would like to point out:

  1. I really need to create and use a big matrix.
  2. I think I have enough RAM to handle this matrix (I have 24 GB of RAM).

Is there an easy way to handle big matrices in numpy?

Just to be on the safe side, I previously read these posts (which sound similar):

Very large matrices using Python and NumPy

Python/Numpy MemoryError

Processing a very very big data set in python - memory error

P.S. Apparently I have some problems with multiplying and dividing numbers, which made me think that I had enough memory. So I think it is time for me to go to sleep, review my math, and maybe buy some memory.
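As a rough check of that arithmetic, here is a minimal sketch. It assumes NumPy's default float64 dtype (8 bytes per element); the variable names are only illustrative.

```python
n_elements = 10_000_000_000   # the value of `a` in the snippet above
bytes_per_float64 = 8         # assumption: default float64 dtype

array_bytes = n_elements * bytes_per_float64
array_gb = array_bytes / 10**9     # decimal gigabytes
array_gib = array_bytes / 2**30    # binary gibibytes

ram_gb = 24
print(f"array alone: {array_gb:.0f} GB ({array_gib:.1f} GiB)")
print(f"shortfall vs {ram_gb} GB of RAM: {array_gb - ram_gb:.0f} GB")
# prints:
# array alone: 80 GB (74.5 GiB)
# shortfall vs 24 GB of RAM: 56 GB
```

On 64-bit CPython the temporary list `[float('nan')] * a` also needs 8 bytes per element for its pointers, so the snippet asks for roughly another 80 GB before NumPy allocates anything.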

Maybe in the meantime some genius will come up with an idea for how to actually create this matrix using only 24 GB of RAM.

Why I need this big matrix: I am not going to do any manipulations with this matrix. All I need to do with it is save it into PyTables.
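Since the matrix only has to end up in PyTables, one possible approach is to never hold it in RAM and to write it to disk chunk by chunk instead. The following is a minimal sketch, assuming PyTables 3.x and NumPy are installed; the function names, the node name `data`, and the chunk size are hypothetical choices for illustration.

```python
def chunk_bounds(n_elements, chunk_size):
    """Yield (start, stop) index pairs that cover range(n_elements) in order."""
    for start in range(0, n_elements, chunk_size):
        yield start, min(start + chunk_size, n_elements)


def write_nan_array(path, n_elements, chunk_size=10_000_000):
    """Fill an on-disk float64 array with NaN, one chunk at a time."""
    # Imported here so chunk_bounds() stays usable without these packages.
    import numpy as np
    import tables

    with tables.open_file(path, mode="w") as h5:
        # "data" is a hypothetical node name chosen for this sketch.
        carr = h5.create_carray(
            h5.root, "data", atom=tables.Float64Atom(), shape=(n_elements,)
        )
        for start, stop in chunk_bounds(n_elements, chunk_size):
            # Only one chunk (about 80 MB at the default size) is in RAM at a time.
            carr[start:stop] = np.full(stop - start, np.nan)
```

Calling `write_nan_array("big.h5", 10_000_000_000)` would keep memory use small, but the file would still hold roughly 80 GB of uncompressed data, so disk space becomes the limit instead of RAM.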

Salvador Dali asked Sep 30 '13 00:09



1 Answer

If you cannot afford to create such a matrix but still wish to do some computations with it, try sparse matrices.
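As a minimal sketch of the sparse-matrix idea, assuming SciPy is installed: `scipy.sparse` stores only the explicitly set entries and treats everything else as zero (not NaN), so it fits matrices that are mostly zeros. The shape and values below are made-up example data.

```python
import numpy as np
from scipy import sparse

# Hypothetical example data: three stored entries in a 1,000,000 x 1,000,000 matrix.
rows = np.array([0, 10, 999_999])
cols = np.array([5, 10, 0])
vals = np.array([1.5, -2.0, 3.0])

# Build in COO format, then convert to CSR for fast arithmetic.
m = sparse.coo_matrix((vals, (rows, cols)), shape=(1_000_000, 1_000_000)).tocsr()

# Matrix-vector product; the result is an ordinary dense vector of length 1,000,000.
v = np.ones(m.shape[1])
result = m @ v

print(m.nnz)       # number of stored entries
print(result[10])  # row 10 has the single entry -2.0
```

A dense float64 matrix of this shape would need about 8 TB, while the sparse version stores three values plus its index arrays.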

If you wish to pass it to another Python package that relies on duck typing, you can create your own class whose __getitem__ implements dummy access.
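A minimal sketch of such a class might look like the following. The name `ConstantArray` and its attributes are hypothetical, and whether a given package accepts it depends on which methods that package actually calls.

```python
import math


class ConstantArray:
    """Read-only 1-D stand-in that reports a huge length but stores one value."""

    def __init__(self, length, fill=float("nan")):
        self.shape = (length,)
        self.fill = fill

    def __len__(self):
        return self.shape[0]

    def __getitem__(self, index):
        n = self.shape[0]
        if isinstance(index, slice):
            # A slice yields a smaller stand-in rather than a real list.
            start, stop, step = index.indices(n)
            return ConstantArray(len(range(start, stop, step)), self.fill)
        if index < 0:
            index += n
        if not 0 <= index < n:
            raise IndexError("index out of range")
        return self.fill


big = ConstantArray(10_000_000_000)
print(len(big), math.isnan(big[123]))
```

The object takes a few hundred bytes regardless of its reported length, because every element access just returns the same fill value.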

Tigran Saluev answered Sep 30 '22 19:09
