I have a large numpy file saved to disk, and I would like to determine its shape without reading in the entire file.
I can get the shape using np.load(filename), but when I try the same with np.memmap, it appears to require that I know the shape in advance; otherwise it defaults to reading the file as a flat array of uint8 values.
Is this possible to do?
Yes, you will find the shape in plain text in the header on the first line of the file:
>>> import numpy as np
>>> a = np.random.rand(4,7)
>>> np.save('/tmp/a', a)
>>>
$ head -1 /tmp/a.npy
�NUMPYv{'descr': '<f8', 'fortran_order': False, 'shape': (4, 7), }
Here is the code to parse this header:
>>> with open('/tmp/a.npy', 'rb') as f:
...     major, minor = np.lib.format.read_magic(f)
...     shape, fortran, dtype = np.lib.format.read_array_header_1_0(f)
...
>>> shape
(4, 7)
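Note that read_array_header_1_0 only applies to files written in format version 1.0. As a rough sketch (the helper name npy_shape and the temporary file path are made up for illustration), you could dispatch on the version returned by read_magic so that version 2.0 files are handled too:

```python
import os
import tempfile

import numpy as np


def npy_shape(path):
    """Read only the .npy header and return (shape, fortran_order, dtype)."""
    with open(path, "rb") as f:
        # read_magic consumes the magic string and returns the format version.
        major, minor = np.lib.format.read_magic(f)
        if major == 1:
            return np.lib.format.read_array_header_1_0(f)
        if major == 2:
            return np.lib.format.read_array_header_2_0(f)
        # Other versions are not handled by this sketch.
        raise ValueError(f"unsupported .npy format version {major}.{minor}")


# Hypothetical example file, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "a.npy")
np.save(path, np.random.rand(4, 7))

shape, fortran_order, dtype = npy_shape(path)
print(shape, fortran_order, dtype)
```

Only the header bytes are read here, so the cost should not depend on how large the array data is.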
np.memmap is generally meant for raw binary files, but np.load can open .npy files in memmap mode. It reads the shape and dtype from the header, so you do not need to supply either. Try:
mmapped_array = np.load(filename, mmap_mode='r')
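To see this end to end, here is a small self-contained sketch. The file path and array contents are invented for the example; the point is that the shape and dtype come back without being passed in, and that slicing pulls only the requested part into memory:

```python
import os
import tempfile

import numpy as np

# Hypothetical example file, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "a.npy")
np.save(path, np.arange(28, dtype=np.float64).reshape(4, 7))

# Open the file memory-mapped and read-only; the data stays on disk.
arr = np.load(path, mmap_mode="r")
print(type(arr).__name__, arr.shape, arr.dtype)

# Copy just the first row into an ordinary in-memory array.
first_row = np.array(arr[0])
print(first_row)
```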