Very large numpy array doesn't throw memory error. Where does it live? [duplicate]

So I have the following numpy array:

X = np.zeros((1000000000, 3000), dtype=np.float32)

X.nbytes returns 12000000000000, which is 12 TB.

I certainly don't have that much memory (8 GB, to be exact). How did this succeed? Where is the array actually allocated?
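For reference, the 12 TB figure follows directly from the shape and dtype; a quick sketch of the arithmetic, without allocating anything:

```python
import numpy as np

# Size of the requested array: rows x cols x bytes per element,
# computed without actually allocating it.
rows, cols = 1_000_000_000, 3000
itemsize = np.dtype(np.float32).itemsize  # 4 bytes
total_bytes = rows * cols * itemsize
print(total_bytes)  # 12000000000000, i.e. 12 TB
```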

asked Oct 19 '17 by user3813674

2 Answers

I guess you are using a Mac. macOS will automatically use available disk space as virtual memory (swap), so maybe you have a very big disk?

This code raises a MemoryError on Linux.

answered Nov 16 '22 by Sraw

I ran this on my Mac (OS 10.13, 16 GB RAM, 512 GB SSD) and got the same successful result that you did.

This comment seems like a possible answer. In summary: since you're using zeros(), the array's memory never has to be written at allocation time. np.zeros() requests zeroed memory from the OS, and the OS hands back virtual pages that all map to a shared zero page. Physical memory is committed only when a page is first written to, so "allocating" 12 TB of zeros costs almost nothing until you actually store data in the array.

Worth noting that running np.random.rand(1000000000, 3000), which allocates the same shape but fills it with actual data (and as float64, so twice the size per element), causes some havoc on my Mac: RAM gets maxed out, then the system starts using the swap partition.

[Screenshot: memory usage before np.random.rand()]

[Screenshot: memory usage during np.random.rand()]
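A small sketch of the lazy-commit behaviour described above, scaled down to a ~400 MB array (Unix-only, since it uses the stdlib resource module; exact numbers are OS-dependent). Allocating with zeros() barely moves the process's resident memory, while writing to the array forces the OS to commit real pages:

```python
import numpy as np
import resource  # Unix-only

def peak_rss():
    # Peak resident set size (KB on Linux, bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

a = np.zeros(100_000_000, dtype=np.float32)  # ~400 MB, virtually mapped
after_alloc = peak_rss()

a[:] = 1.0  # touching every page commits physical memory
after_write = peak_rss()

print(after_alloc, after_write)  # resident size typically jumps only after the write
```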

answered Nov 16 '22 by Matt Popovich