
Numpy append: Automatically cast an array of the wrong dimension

Is there a way to do the following without an if clause?

I'm reading a set of NetCDF files with pupynere and want to build an array with numpy.append. Sometimes the input data is two-dimensional (see variable "a" below), sometimes one-dimensional ("b"), but the number of elements along the last dimension is always the same ("9" in the example below).

> import numpy as np
> a = np.arange(27).reshape(3,9)
> b = np.arange(9)
> a.shape
(3, 9)
> b.shape
(9,)

This works as expected:

> np.append(a,a, axis=0)
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26],
       [ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26]])

But appending b does not work so elegantly:

> np.append(a,b, axis=0)
ValueError: arrays must have same number of dimensions

The problem with append is (from the NumPy manual):

"When axis is specified, values must have the correct shape."

I'd have to reshape b first in order to get the right result:

> np.append(a,b.reshape(1,9), axis=0)
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26],
       [ 0,  1,  2,  3,  4,  5,  6,  7,  8]])

So, in my file reading loop, I'm currently using an if clause like this:

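result = np.empty((0, 9), dtype=int)  # start with zero rows, nine columns
for i in [a, b]:
    if i.ndim == 2:
        # already 2-D: append the rows directly
        result = np.append(result, i, axis=0)
    else:
        # 1-D: reshape to a single row first
        result = np.append(result, i.reshape(1, 9), axis=0)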

Is there a way to append "a" and "b" without the if statement?

EDIT: While @Sven answered the original question perfectly (using np.atleast_2d()), he (and others) pointed out that the code is inefficient. In an answer below, I combined their suggestions and replaced my original code. It should be much more efficient now. Thanks.

Sebastian asked Apr 20 '11 11:04


2 Answers

You can use numpy.atleast_2d():

result = np.append(result, np.atleast_2d(i), axis=0)
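To see why this removes the need for the if clause, here is a minimal sketch using the a and b from the question. The empty (0, 9) starting array is an assumption about how result is initialized, since the question does not show it.

```python
import numpy as np

a = np.arange(27).reshape(3, 9)   # already 2-D
b = np.arange(9)                  # 1-D

# atleast_2d leaves 2-D input unchanged and promotes 1-D input to shape (1, N)
print(np.atleast_2d(a).shape)  # (3, 9)
print(np.atleast_2d(b).shape)  # (1, 9)

# Assumed starting point: zero rows, nine columns
result = np.empty((0, 9), dtype=a.dtype)
for i in [a, b]:
    result = np.append(result, np.atleast_2d(i), axis=0)

print(result.shape)  # (4, 9)
```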

That said, note that the repeated use of numpy.append() is a very inefficient way to build a NumPy array -- every call allocates a new array and copies all existing data into it. If at all possible, preallocate the array with the desired final size and populate it afterwards using slicing.
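A rough sketch of the preallocation approach, assuming the total number of rows can be worked out before filling. The chunks list and the n_rows and row names are hypothetical stand-ins for whatever the file-reading loop provides; in practice the row count might come from file metadata instead.

```python
import numpy as np

a = np.arange(27).reshape(3, 9)
b = np.arange(9)
chunks = [a, b]   # hypothetical stand-in for the data read from each file

# Assumption: the total row count is known (here it is computed from the chunks)
n_rows = sum(np.atleast_2d(c).shape[0] for c in chunks)
result = np.empty((n_rows, 9), dtype=a.dtype)

row = 0
for c in chunks:
    c = np.atleast_2d(c)               # treat 1-D input as a single row
    result[row:row + c.shape[0]] = c   # copy into the preallocated slice
    row += c.shape[0]

print(result.shape)  # (4, 9)
```

Each chunk is copied exactly once, so the cost no longer grows with every step the way it does with repeated append calls.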

Sven Marnach answered Oct 11 '22 02:10


You can just add all of the arrays to a list, then use np.vstack() to concatenate them all together at the end. This avoids constantly reallocating the growing array with every append.

> a = np.arange(27).reshape(3,9)
> b = np.arange(9)
> np.vstack([a,b])
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26],
       [ 0,  1,  2,  3,  4,  5,  6,  7,  8]])
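Applied to a reading loop, the same idea might look like the sketch below. The pieces list is a hypothetical name, and the loop over [a, b] stands in for the actual file reading.

```python
import numpy as np

a = np.arange(27).reshape(3, 9)
b = np.arange(9)

pieces = []
for i in [a, b]:      # stand-in for the file-reading loop
    pieces.append(i)  # Python list append; no array is copied here

# vstack promotes 1-D inputs to single rows, so no if clause is needed
result = np.vstack(pieces)
print(result.shape)  # (4, 9)
```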

Robert Kern answered Oct 11 '22 02:10