
Can I get the shape of a numpy save file without reading the entire contents (e.g. memmap)

Tags:

python

numpy

I have a large numpy file saved to disk, and I would like to determine its shape without reading in the entire file.

I can get the shape using np.load(filename), but np.memmap appears to require that I know the shape in advance; otherwise it defaults to reading the file as a flat array of uint8 values.

Is this possible to do?

David Parks asked Feb 02 '18 21:02



2 Answers

Yes, the shape is stored in plain text in the header, on the first line of the file:

>>> import numpy as np
>>> a = np.random.rand(4,7)
>>> np.save('/tmp/a', a)
>>>
$ head -1 /tmp/a.npy
�NUMPYv{'descr': '<f8', 'fortran_order': False, 'shape': (4, 7), }

Here is the code to parse this header:

>>> with open('/tmp/a.npy', 'rb') as f:
...     major, minor = np.lib.format.read_magic(f)
...     shape, fortran, dtype = np.lib.format.read_array_header_1_0(f)
...     
>>> shape
(4, 7)
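The snippet above assumes a version 1.0 file. As a rough sketch, the version returned by read_magic can be used to pick the matching header reader. The helper name `npy_shape` is hypothetical (it is not part of NumPy), and rejecting any version other than 1.0 and 2.0 is a simplification made for this example:

```python
import numpy as np
from numpy.lib import format as npy_format


def npy_shape(path):
    """Return (shape, dtype) of a .npy file by reading only its header.

    Hypothetical helper for illustration; not part of NumPy itself.
    """
    with open(path, "rb") as f:
        # read_magic consumes the magic string and returns (major, minor).
        version = npy_format.read_magic(f)
        if version == (1, 0):
            shape, fortran_order, dtype = npy_format.read_array_header_1_0(f)
        elif version == (2, 0):
            shape, fortran_order, dtype = npy_format.read_array_header_2_0(f)
        else:
            # Assumption: other format versions are out of scope for this sketch.
            raise ValueError(f"unsupported .npy version: {version}")
    return shape, dtype
```

Calling `npy_shape('/tmp/a.npy')` on the file saved earlier should give back the shape tuple and the dtype without touching the array data.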
wim answered Oct 19 '22 02:10



np.memmap is generally meant for raw binary files with no header, but np.load can open .npy files in memory-mapped mode and takes the shape and dtype from the file's header. No shape or dtype needed! Try:

mmapped_array = np.load(filename, mmap_mode='r')
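To make that concrete, here is a small sketch, assuming a throwaway file written to a temporary directory (the path and array contents are made up for the demonstration). It reads the shape and dtype from the memory-mapped array and then copies a single row into memory:

```python
import os
import tempfile

import numpy as np

# Hypothetical file created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "big.npy")
np.save(path, np.arange(28, dtype=np.float64).reshape(4, 7))

# mmap_mode='r' maps the file read-only; the array data is not loaded eagerly.
arr = np.load(path, mmap_mode="r")
shape, dtype = arr.shape, arr.dtype

# Slicing touches only the parts of the file that are needed;
# np.array(...) copies that one row into ordinary memory.
row = np.array(arr[2])
```

The returned object is an np.memmap, so it can be sliced like a normal array while the bulk of the data stays on disk.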
c-wilson answered Oct 19 '22 02:10
