Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to append many numpy files into one numpy file in python

Tags:

python

numpy

I am trying to put many numpy files to get one big numpy file, I tried to follow those two links Append multiple numpy files to one big numpy file in python and Python append multiple files in given order to one big file this is what I did:

import matplotlib.pyplot as plt 
import numpy as np
import glob
import os, sys
fpath ="/home/user/Desktop/OutFileTraces.npy"
npyfilespath ="/home/user/Desktop/test"   
os.chdir(npyfilespath)
with open(fpath,'wb') as f_handle:
    for npfile in glob.glob("*.npy"):
        # Find the path of the file
        filepath = os.path.join(npyfilespath, npfile)
        print filepath
        # Load file
        dataArray= np.load(filepath)
        print dataArray
        np.save(f_handle,dataArray)
        dataArray= np.load(fpath)
        print dataArray

An example of the result that I have:

/home/user/Desktop/Trace=96
[[ 0.01518007  0.01499514  0.01479736 ..., -0.00392216 -0.0039761
  -0.00402747]]
[[-0.00824758 -0.0081808  -0.00811402 ..., -0.0077236  -0.00765425
  -0.00762086]]
/home/user/Desktop/Trace=97
[[ 0.00614908  0.00581004  0.00549154 ..., -0.00814741 -0.00813457
  -0.00809347]]
[[-0.00824758 -0.0081808  -0.00811402 ..., -0.0077236  -0.00765425
  -0.00762086]]
/home/user/Desktop/Trace=98
[[-0.00291786 -0.00309509 -0.00329287 ..., -0.00809861 -0.00797789
  -0.00784175]]
[[-0.00824758 -0.0081808  -0.00811402 ..., -0.0077236  -0.00765425
  -0.00762086]]
/home/user/Desktop/Trace=99
[[-0.00379887 -0.00410453 -0.00438963 ..., -0.03497837 -0.0353842
  -0.03575151]]
[[-0.00824758 -0.0081808  -0.00811402 ..., -0.0077236  -0.00765425
  -0.00762086]

this line represents the first trace:

[[-0.00824758 -0.0081808  -0.00811402 ..., -0.0077236  -0.00765425
      -0.00762086]]

It is repeated all the time.

I asked the second question two days ago, at first I think that I had the best answer, but after trying to model to print and lot the final file 'OutFileTraces.npy' I found that my code:

1/ doesn't print numpy files from folder 'test' with respecting their order(trace0,trace1, trace2,...)

2/ saves only the last trace in the file, I mean by that when print or plot the OutFileTraces.npy, I found just one trace , it is the first one.

So I need to correct my code because really I am blocked. I would be very grateful if you could help me.

Thanks in advance.

like image 998
nass9801 Avatar asked Feb 13 '17 12:02

nass9801


People also ask

How do I combine multiple NumPy arrays?

Use numpy. concatenate() to merge the content of two or multiple arrays into a single array. This function takes several arguments along with the NumPy arrays to concatenate and returns a Numpy array ndarray. Note that this method also takes axis as another argument, when not specified it defaults to 0.

Is appending to NumPy array faster than list?

NumPy Arrays Are NOT Always Faster Than Lists " append() " adds values to the end of both lists and NumPy arrays.

Is appending to NumPy array efficient?

Appending to numpy arrays is very inefficient. This is because the interpreter needs to find and assign memory for the entire array at every single step. Depending on the application, there are much better strategies. If you know the length in advance, it is best to pre-allocate the array using a function like np.

How do I append to an existing NumPy array?

You can use numpy. append() function to add an element in a NumPy array. You can pass the NumPy array and multiple values as arguments to the append() function. It doesn't modify the existing array but returns a copy of the passed array with given values added.


2 Answers

  1. Glob produces unordered lists. You need to sort explicitly with an extra line as the sorting procedure is in-place and does not return the list.

    npfiles = glob.glob("*.npy")
    npfiles.sort()
    for npfile in npfiles:
        ...
    
  2. NumPy files contain a single array. If you want to store several arrays in a single file you may have a look at .npz files with np.savez https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html#numpy.savez I have not seen this in use widely, so you may wish seriously to consider alternatives.

    1. If your arrays are all of the same shape and store related data, you can make a larger array. Say that the current shape is (N_1, N_2) and that you have N_0 such arrays. A loop with

      all_arrays = []
      for npfile in npfiles:
          all_arrays.append(np.load(os.path.join(npyfilespath, npfile)))
      all_arrays = np.array(all_arrays)
      np.save(f_handle, all_array)
      

      will produce a file with a single array of shape (N_0, N_1, N_2)

    2. If you need per-name access to the arrays, HDF5 files are a good match. See http://www.h5py.org/ (a full intro is too much for a SO reply, see the quick start guide http://docs.h5py.org/en/latest/quick.html)
like image 178
Pierre de Buyl Avatar answered Oct 24 '22 00:10

Pierre de Buyl


As discussed in

loading arrays saved using numpy.save in append mode

it is possible to save multiple times to an open file, and it possible to load multiple times. That's not documented, and probably not preferred, but it works. savez archive is the preferred method for saving multiple arrays.

Here's a toy example:

In [777]: with open('multisave.npy','wb') as f:
     ...:     arr = np.arange(10)
     ...:     np.save(f, arr)
     ...:     arr = np.arange(20)
     ...:     np.save(f, arr)
     ...:     arr = np.ones((3,4))
     ...:     np.save(f, arr)
     ...:     
In [778]: ll multisave.npy
-rw-rw-r-- 1 paul 456 Feb 13 08:38 multisave.npy
In [779]: with open('multisave.npy','rb') as f:
     ...:     arr = np.load(f)
     ...:     print(arr)
     ...:     print(np.load(f))
     ...:     print(np.load(f))
     ...:     
[0 1 2 3 4 5 6 7 8 9]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]

Here's a simple example of saving a list of arrays of the same shape

In [780]: traces = [np.arange(10),np.arange(10,20),np.arange(100,110)]
In [781]: traces
Out[781]: 
[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
 array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])]
In [782]: arr = np.array(traces)
In [783]: arr
Out[783]: 
array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
       [ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]])

In [785]: np.save('mult1.npy', arr)

In [786]: data = np.load('mult1.npy')
In [787]: data
Out[787]: 
array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
       [ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]])
In [788]: list(data)
Out[788]: 
[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
 array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])]
like image 1
hpaulj Avatar answered Oct 24 '22 01:10

hpaulj