I have a 3.374 GB npz file, myfile.npz.
I can read it in and view the filenames:
a = np.load('myfile.npz')
a.files
gives
['arr_1','arr_0']
I can read in 'arr_1' OK:
a1 = a['arr_1']
However, I cannot load arr_0, or even read its shape:
a1 = a['arr_0']
a['arr_0'].shape
Both of the above operations give the following error:
ValueError: array is too big
I have 16 GB of RAM, of which 8.370 GB is available, so the problem doesn't seem to be related to memory. My questions are:
Should I be able to read this file in?
Can anyone explain this error?
I have been looking at using np.memmap to get around this - is this a reasonable approach?
What debugging approach should I use?
EDIT:
I got access to a computer with more RAM (48 GB) and it loaded. The dtype was in fact complex128, and the uncompressed size of a['arr_0'] was 5750784000 bytes. It seems that some RAM overhead is required beyond the raw array size. Either that, or my estimate of available RAM was wrong (I used Windows Sysinternals RAMMap).
An np.complex128 array with dimensions (200, 1440, 3, 13, 32) ought to take up about 5.35 GiB uncompressed, so if you really did have 8.3 GB of free, addressable memory then in principle you ought to be able to load the array.
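For reference, a quick sanity check of that figure, using the shape quoted above and the byte count from the question's EDIT:
import numpy as np
# shape reported in the comments, dtype complex128 (16 bytes per element)
shape = (200, 1440, 3, 13, 32)
n_elements = np.prod(shape, dtype=np.int64)              # 359,424,000 elements
n_bytes = n_elements * np.dtype(np.complex128).itemsize
print(n_bytes)                                           # 5750784000 bytes
print(n_bytes / 2**30)                                   # ~5.36 GiB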
However, based on your responses in the comments below, you are using 32-bit versions of Python and numpy. On Windows, a 32-bit process can only address up to 2 GB of memory (or 4 GB if the binary was compiled with the IMAGE_FILE_LARGE_ADDRESS_AWARE flag; most 32-bit Python distributions are not). Consequently, your Python process is limited to 2 GB of address space regardless of how much physical memory you have.
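If you're not sure which Python build you're running, here's a quick check that uses only the standard library:
import struct, sys
# prints 32 on a 32-bit interpreter, 64 on a 64-bit one
print(struct.calcsize('P') * 8)
# equivalently, this is True only on 64-bit builds
print(sys.maxsize > 2**32)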
You can either install 64-bit versions of Python, numpy, and any other Python libraries you need, or live with the 2 GB limit and try to work around it. In the latter case you might get away with storing arrays that exceed the 2 GB limit mainly on disk (e.g. using np.memmap), but I'd advise you to go for option #1, since operations on memmapped arrays are in most cases a lot slower than on normal np.arrays that reside wholly in RAM.
If you already have another machine with enough RAM to load the whole array into core memory, then I would suggest you save the array in a different format (either as a plain np.memmap binary or, perhaps better, in an HDF5 file using PyTables or h5py). It's also possible (although slightly trickier) to extract the problem array from the .npz file without loading it into RAM, so that you can then open it as an np.memmap array residing on disk:
import numpy as np
# some random sparse (compressible) data
x = np.random.RandomState(0).binomial(1, 0.25, (1000, 1000))
# save it as a compressed .npz file
np.savez_compressed('x_compressed.npz', x=x)
# now load it as a numpy.lib.npyio.NpzFile object
obj = np.load('x_compressed.npz')
# contains a list of the stored arrays in the format '<name>.npy'
namelist = obj.zip.namelist()
# extract 'x.npy' into the current directory
obj.zip.extract(namelist[0])
# now we can open the array as a memmap
x_memmap = np.load(namelist[0], mmap_mode='r+')
# check that x and x_memmap are identical
assert np.all(x == x_memmap[:])
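Following up on the HDF5 suggestion above, here is a minimal sketch (not part of the original answer) of re-saving the problem array with h5py on the machine that has enough RAM, then reading it back slice by slice on the memory-limited machine. The file name 'arr_0.h5' and dataset name 'arr_0' are just example names:
import numpy as np
import h5py
# on the machine with enough RAM: load the array once and re-save it as HDF5
a = np.load('myfile.npz')
with h5py.File('arr_0.h5', 'w') as f:
    f.create_dataset('arr_0', data=a['arr_0'], compression='gzip')
# later, on the memory-limited machine: read only the slices you need,
# without ever holding the full ~5.75 GB array in memory
with h5py.File('arr_0.h5', 'r') as f:
    dset = f['arr_0']
    print(dset.shape, dset.dtype)
    first_slab = dset[0]   # reads a single (1440, 3, 13, 32) slab from disk
Because h5py only reads the slices you actually index, this sidesteps both the 32-bit address-space limit and the need to map the whole array at once.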