save numpy array in append mode

Tags: python, save, numpy

Is it possible to save a numpy array by appending it to an already existing npy file, with something like np.save(filename, arr, mode='a')?

I have several functions that have to iterate over the rows of a large array. I cannot create the array all at once because of memory constraints. To avoid creating the rows over and over again, I wanted to create each row once and save it to a file, appending it after the previous row. Later I could load the npy file with mmap_mode and access slices as needed.

asked May 21 '15 14:05

user3820991

2 Answers

The built-in .npy file format is perfectly fine for working with small datasets, without relying on external modules other than numpy.
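As a rough illustration of how far plain .npy can go, here is a minimal sketch that assumes the total number of rows is known in advance (the question does not say so). It preallocates the file with np.lib.format.open_memmap, fills it one row at a time, and later maps it read-only to fetch a slice. The file name rows.npy, the sizes N_ROWS and ROW_SIZE, and the np.full stand-in for the row computation are all hypothetical.

```python
import os
import tempfile

import numpy as np

N_ROWS, ROW_SIZE = 5, 4  # hypothetical sizes, assumed known up front

# Hypothetical output file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "rows.npy")

# Create a valid .npy file of the final shape without building the array in memory.
out = np.lib.format.open_memmap(
    path, mode="w+", dtype=np.float64, shape=(N_ROWS, ROW_SIZE)
)
for i in range(N_ROWS):
    out[i, :] = np.full(ROW_SIZE, i)  # stand-in for an expensive row computation
out.flush()  # push pending writes to disk
del out      # release the memory map

# Later: map the file read-only and copy out only the slice that is needed.
rows = np.load(path, mmap_mode="r")
subset = np.array(rows[1:3, :2])
```

This does not append to a file of unknown final size; it only covers the case where the shape can be fixed before the first row is written.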

However, once you have large amounts of data, a file format designed to handle such datasets, such as HDF5, is preferable [1].

For instance, below is a solution that saves numpy arrays in HDF5 with PyTables:

Step 1: Create an extendable EArray storage

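    import tables
    import numpy as np

    filename = 'outarray.h5'
    ROW_SIZE = 100   # number of values in each row
    NUM_ROWS = 200   # number of rows appended below

    f = tables.open_file(filename, mode='w')
    atom = tables.Float64Atom()

    # The first dimension has length 0, which makes it the extendable one.
    array_c = f.create_earray(f.root, 'data', atom, (0, ROW_SIZE))

    for idx in range(NUM_ROWS):
        x = np.random.rand(1, ROW_SIZE)  # one row, shape (1, ROW_SIZE)
        array_c.append(x)
    f.close()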

Step 2: Append rows to an existing dataset (if needed)

    f = tables.open_file(filename, mode='a')
    f.root.data.append(x)  # x must have shape (n, ROW_SIZE)
    f.close()

Step 3: Read back a subset of the data

    f = tables.open_file(filename, mode='r')
    print(f.root.data[1:10, 2:20])  # e.g. read only this part of the dataset from disk
    f.close()

answered Oct 10 '22 02:10

rth

This is an expansion on Mohit Pandey's answer showing a full save / load example. It was tested using Python 3.6 and Numpy 1.11.3.

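    from pathlib import Path
    import numpy as np
    import os

    p = Path('temp.npy')
    # Opening in 'ab' mode writes each array as a separate record at the end of the file.
    with p.open('ab') as f:
        np.save(f, np.zeros(2))
        np.save(f, np.ones(2))

    with p.open('rb') as f:
        fsz = os.fstat(f.fileno()).st_size  # total file size in bytes
        out = np.load(f)                    # first record
        while f.tell() < fsz:               # keep reading until the end of the file
            out = np.vstack((out, np.load(f)))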

out = array([[ 0., 0.], [ 1., 1.]])
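The same idea can be wrapped in two small helpers. The sketch below is a variation on the example above rather than part of it: it collects the records in a list and stacks them with a single np.vstack call, which avoids copying the growing array on every iteration. The helper names append_array and load_all are hypothetical, and a scratch directory is used so that repeated runs do not keep extending the same file.

```python
import os
import tempfile

import numpy as np


def append_array(path, arr):
    """Append one array as a separate .npy record (hypothetical helper)."""
    with open(path, "ab") as f:
        np.save(f, arr)


def load_all(path):
    """Read every record back and stack them once (hypothetical helper)."""
    chunks = []
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        while f.tell() < size:  # stop once the last record has been read
            chunks.append(np.load(f))
    return np.vstack(chunks)


path = os.path.join(tempfile.mkdtemp(), "temp.npy")
append_array(path, np.zeros(2))
append_array(path, np.ones(2))
stacked = load_all(path)
```

Note that a file written this way holds several independent .npy records, so np.load(path, mmap_mode='r') would only see the first one; reading the rest requires walking the file as above.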

answered Oct 10 '22 02:10

PaxRomana99
