
What is the way data is stored in *.npy?

Tags:

python

numpy

I'm saving NumPy arrays with the numpy.save function. I want other developers to be able to read the data from those files in C, so I need to know how NumPy organizes the binary data in the file. It's obvious when I'm saving an array of 'i4', but what about an array of arrays that contains some structures? I can't find any information in the documentation.

UPD: let's say the data is something like:

dt = np.dtype([('outer','(3,)<i4'),('outer2',[('inner','(10,)<i4'),('inner2','f8')])]) 

UPD2: What about saving "dynamic" data (dtype object)?

import numpy as np

a = [0,0,0]
b = [0,0]
c = [a,b]

dtype = np.dtype([('Name', '|S2'), ('objValue', object)])
data = np.zeros(3, dtype)
data[0]['objValue'] = a
data[1]['objValue'] = b
data[2]['objValue'] = c
data[0]['Name'] = 'a'
data[1]['Name'] = 'b'
data[2]['Name'] = 'c'

np.save(r'D:\in.npy', data)

Is it feasible to read such a file from C?

illegal-immigrant asked Nov 03 '10 17:11



2 Answers

The npy file format is documented in numpy's NEP 1 — A Simple File Format for NumPy Arrays.

For instance, the code

>>> import numpy
>>> dt=numpy.dtype([('outer','(3,)<i4'),
...                 ('outer2',[('inner','(10,)<i4'),('inner2','f8')])])
>>> a=numpy.array([((1,2,3),((10,11,12,13,14,15,16,17,18,19),3.14)),
...                ((4,5,6),((-1,-2,-3,-4,-5,-6,-7,-8,-9,-20),6.28))],dt)
>>> numpy.save('1.npy', a)

results in the file:

93 4E 55 4D 50 59                      magic ("\x93NUMPY")
01                                     major version (1)
00                                     minor version (0)

96 00                                  HEADER_LEN (0x0096 = 150)
7B 27 64 65 73 63 72 27
3A 20 5B 28 27 6F 75 74
65 72 27 2C 20 27 3C 69
34 27 2C 20 28 33 2C 29
29 2C 20 28 27 6F 75 74
65 72 32 27 2C 20 5B 28
27 69 6E 6E 65 72 27 2C
20 27 3C 69 34 27 2C 20
28 31 30 2C 29 29 2C 20
28 27 69 6E 6E 65 72 32                Header, describing the data structure
27 2C 20 27 3C 66 38 27                "{'descr': [('outer', '<i4', (3,)),
29 5D 29 5D 2C 20 27 66                            ('outer2', [
6F 72 74 72 61 6E 5F 6F                               ('inner', '<i4', (10,)),
72 64 65 72 27 3A 20 46                               ('inner2', '<f8')]
61 6C 73 65 2C 20 27 73                            )],
68 61 70 65 27 3A 20 28                  'fortran_order': False,
32 2C 29 2C 20 7D 20 20                  'shape': (2,), }"
20 20 20 20 20 20 20 20
20 20 20 20 20 0A

01 00 00 00 02 00 00 00 03 00 00 00    (1,2,3)
0A 00 00 00 0B 00 00 00 0C 00 00 00
0D 00 00 00 0E 00 00 00 0F 00 00 00
10 00 00 00 11 00 00 00 12 00 00 00
13 00 00 00                            (10,11,12,13,14,15,16,17,18,19)
1F 85 EB 51 B8 1E 09 40                3.14

04 00 00 00 05 00 00 00 06 00 00 00    (4,5,6)
FF FF FF FF FE FF FF FF FD FF FF FF
FC FF FF FF FB FF FF FF FA FF FF FF
F9 FF FF FF F8 FF FF FF F7 FF FF FF
EC FF FF FF                            (-1,-2,-3,-4,-5,-6,-7,-8,-9,-20)
1F 85 EB 51 B8 1E 19 40                6.28
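
The layout in the dump can be walked through by hand with nothing but the Python standard library, which is roughly what a C reader would have to do. The following is a minimal sketch under several assumptions: format version 1.0, little-endian fields, a packed record with no alignment padding, and no object fields. The names build_npy_v1, parse_npy_v1, RECORD, HEADER_TEXT and SAMPLE_RECORDS are made up for illustration and are not part of NumPy.

```python
import ast
import struct

MAGIC = b"\x93NUMPY"

# Packed little-endian record: 3 x int32, 10 x int32, 1 x float64 (60 bytes).
RECORD = struct.Struct("<3i10id")

# Header text exactly as it appears in the dump (shape is fixed at 2 records).
HEADER_TEXT = (
    "{'descr': [('outer', '<i4', (3,)), "
    "('outer2', [('inner', '<i4', (10,)), ('inner2', '<f8')])], "
    "'fortran_order': False, 'shape': (2,), }"
)

# The two records from the example, flattened in field order.
SAMPLE_RECORDS = [
    (1, 2, 3, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 3.14),
    (4, 5, 6, -1, -2, -3, -4, -5, -6, -7, -8, -9, -20, 6.28),
]


def build_npy_v1(flat_records):
    """Assemble a version 1.0 .npy byte string by hand (illustration only)."""
    header = HEADER_TEXT.encode("latin1")
    # Pad with spaces and terminate with a newline so that
    # magic (6) + version (2) + HEADER_LEN (2) + header is a multiple of 16.
    pad = -(len(MAGIC) + 2 + 2 + len(header) + 1) % 16
    header += b" " * pad + b"\n"
    payload = b"".join(RECORD.pack(*rec) for rec in flat_records)
    return MAGIC + bytes([1, 0]) + struct.pack("<H", len(header)) + header + payload


def parse_npy_v1(blob):
    """Return (header dict, data offset, nested records) for the layout above."""
    if blob[:6] != MAGIC:
        raise ValueError("not an .npy file")
    if (blob[6], blob[7]) != (1, 0):
        raise ValueError("this sketch only handles format version 1.0")
    # HEADER_LEN is a little-endian unsigned 16-bit integer at offset 8.
    (header_len,) = struct.unpack_from("<H", blob, 8)
    data_offset = 10 + header_len
    # The header is a Python dict literal; a C reader would need its own parser.
    header = ast.literal_eval(blob[10:data_offset].decode("latin1"))
    (count,) = header["shape"]
    records = []
    for i in range(count):
        flat = RECORD.unpack_from(blob, data_offset + i * RECORD.size)
        records.append((flat[0:3], (flat[3:13], flat[13])))
    return header, data_offset, records


blob = build_npy_v1(SAMPLE_RECORDS)
header, data_offset, records = parse_npy_v1(blob)
print(header["descr"])
print(data_offset, RECORD.size)
print(records[0])
```

For this dtype the data starts at offset 160 (10 preamble bytes plus the 150-byte header) and each record occupies 60 bytes, so the equivalent C struct would have to be declared packed, with int32_t and double members in the same order.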
kennytm answered Sep 27 '22 18:09


The format is described in numpy/lib/format.py, where you can also see the Python source code used to load npy files. np.load itself is defined in the same numpy/lib package.
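
The helpers in numpy.lib.format can also be called directly to inspect a file before writing a reader in another language. The sketch below assumes NumPy is installed and that the file uses format version 1.0; describe_npy is a hypothetical helper name, while read_magic and read_array_header_1_0 are functions from numpy.lib.format. According to the format specification, when the dtype contains Python objects the data section is a pickle of the array rather than raw records, so such a file is not practical to read from plain C.

```python
import io

import numpy as np
from numpy.lib import format as npy_format


def describe_npy(stream):
    """Read only the preamble and header of a version 1.0 .npy stream."""
    version = npy_format.read_magic(stream)
    if version != (1, 0):
        raise ValueError("this sketch only handles format version 1.0")
    shape, fortran_order, dtype = npy_format.read_array_header_1_0(stream)
    return {
        "version": version,
        "shape": shape,
        "fortran_order": fortran_order,
        "itemsize": dtype.itemsize,      # bytes per record
        "has_object": dtype.hasobject,   # True means the data is pickled
        "data_offset": stream.tell(),    # where the array data begins
    }


# Nested structured dtype, written in the tuple form the header uses.
nested = np.dtype([("outer", "<i4", (3,)),
                   ("outer2", [("inner", "<i4", (10,)), ("inner2", "<f8")])])
buf = io.BytesIO()
np.save(buf, np.zeros(2, nested))
buf.seek(0)
nested_info = describe_npy(buf)
print(nested_info)

# A dtype with an object field, as in the second update to the question.
dynamic = np.dtype([("Name", "|S2"), ("objValue", object)])
obj_buf = io.BytesIO()
np.save(obj_buf, np.zeros(3, dynamic))
obj_buf.seek(0)
object_info = describe_npy(obj_buf)
print(object_info["has_object"])
```

The data_offset value is reported rather than hard-coded because the amount of header padding can differ between NumPy versions.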

unutbu answered Sep 27 '22 18:09