
numpy append_fields gives shape error for new field with 2D shape

Tags: python, numpy

I have a structured numpy array, and I want to use append_fields() or rec_append_fields() from the recfunctions library (http://pyopengl.sourceforge.net/pydoc/numpy.lib.recfunctions.html) to append a field that has a shape of its own. However, I get an error:

ValueError: operands could not be broadcast together with shapes (10) (10,3)

where 10 is the length of my existing array, and (3,) is the shape of the field I want to append.

For example:

import numpy as np
from numpy.lib.recfunctions import append_fields


my_structured_array = np.array(
    # list() is needed on Python 3, where zip() returns an iterator
    list(zip([0,1,2,3],[[4.3,3.2],[1.4,5.6],[6.,2.5],[4.5,5.4]])),
    dtype=[('id','int8'),('pos','2float16')]
    )
my_new_field = np.ones(
    len(my_structured_array),
    dtype='2int8'
    )
my_appended_array = append_fields(
    my_structured_array,
    'new',
    data=my_new_field
    )

ValueError: operands could not be broadcast together with shapes (4) (4,2)

Any ideas? I tried making my_new_field a list of tuples and passing a dtype argument with the proper shape to append_fields():

my_new_field = len(my_structured_array)*[(1,1)]

my_appended_array = append_fields(
    my_structured_array,
    'new',
    data=my_new_field,
    dtype='2int8'
    )

but that seems to end up the same once it gets converted to a numpy array.

None of this changes when I use rec_append_fields() instead of append_fields().

EDIT: As @radicalbiscuit suggested, my new field doesn't have the same shape as my array, so I suppose that my desired append is impossible as written.

In : my_new_field.shape
Out: (4, 2)

In : my_structured_array.shape
Out: (4,)

However, I deliberately gave one of the original fields ('pos') a shape different from that of the array to make my point: a field does not have to have the same shape as the structured array. How can I append a field like this?

In : my_structured_array['pos'].shape
Out: (4, 2)

In : my_new_field.shape
Out: (4, 2)

I should note that for my application, I can append an empty field as long as it's possible to somehow change the shape later. Thanks!

askewchan asked Dec 10 '12 05:12


2 Answers

append_fields() does indeed require that the two arrays be the same shape. That said, as you noticed with my_structured_array, numpy does support subarrays (that is, a field can itself be an array with a shape).

In your case, I think you want my_new_field to be not a two-dimensional array but a one-dimensional array (with the same shape as my_structured_array) whose elements have a dtype such as dtype([('myfield', '<i8', (2,))]). For example,

import numpy as np
from numpy.lib.recfunctions import append_fields

my_structured_array = np.array(
    # list() is needed on Python 3, where zip() returns an iterator
    list(zip([0,1,2,3],[[4.3,3.2],[1.4,5.6],[6.,2.5],[4.5,5.4]])),
    dtype=[('id','int8'),('pos','2float16')]
    )

my_new_field = np.ones(
    len(my_structured_array),
    dtype=[('myfield', 'i8', 2)]
    )

my_appended_array = append_fields(
    my_structured_array,
    'new',
    data=my_new_field
    )

This yields,

>>> my_appended_array[0]
(0, [4.30078125, 3.19921875], ([1, 1],))

Although the datatype is slightly inconvenient as myfield is nested within new,

>>> my_appended_array.dtype
dtype([('id', '|i1'), ('pos', '<f2', (2,)), ('new', [('myfield', '<i8', (2,))])])

This, however, is coerced away fairly easily,

>>> np.asarray(my_appended_array, dtype=[('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])
array([(0, [4.30078125, 3.19921875], [0, 0]),
       (1, [1.400390625, 5.6015625], [0, 0]), (2, [6.0, 2.5], [0, 0]),
       (3, [4.5, 5.3984375], [0, 0])], 
      dtype=[('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])

Still, it's a bit unfortunate that we've had to repeat the dtype of my_structured_array here. While at first glance it appears that numpy.lib.recfunctions.flatten_descr could do the dirty work of flattening the dtype, it unfortunately gives a tuple and not a list as required by np.dtype. Coercing its output to a list, however, works around this issue,

>>> np.dtype(list(np.lib.recfunctions.flatten_descr(my_appended_array.dtype)))
dtype([('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])

This can be passed as the dtype to np.asarray, making things slightly more robust against changes in my_structured_array.dtype.

Indeed, minor inconsistencies such as this make working with record arrays messy business. One gets the feeling that things could fit together a bit more coherently.

Edit: It turns out that the np.lib.recfunctions.merge_arrays function is much more amenable to this sort of merging,

>>> from numpy.lib.recfunctions import merge_arrays
>>> merge_arrays([my_structured_array, my_new_field], flatten=True)
array([(0, [4.30078125, 3.19921875], [1, 1]),
       (1, [1.400390625, 5.6015625], [1, 1]), (2, [6.0, 2.5], [1, 1]),
       (3, [4.5, 5.3984375], [1, 1])],
      dtype=[('id', '|i1'), ('pos', '<f2', (2,)), ('myfield', '<i8', (2,))])
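
If the recfunctions helpers behave differently in your numpy version, the same result can probably be reached by hand: allocate an array with the extended dtype and copy the fields over one by one. The following is only a sketch; append_subarray_field is a hypothetical helper name, not part of numpy, and it assumes a 1-D structured array and a new field with at least one trailing dimension.

```python
import numpy as np

def append_subarray_field(base, name, values):
    """Copy 1-D structured array `base` and add field `name` (hypothetical helper).

    `values` needs one row per record; its trailing dimensions become the
    subarray shape of the new field.
    """
    values = np.asarray(values)
    if base.ndim != 1 or values.ndim < 2 or len(values) != len(base):
        raise ValueError("expected 1-D base and values of shape (len(base), ...)")
    # Reuse the existing field dtypes (subarray shapes included), then add ours.
    fields = [(n, base.dtype[n]) for n in base.dtype.names]
    fields.append((name, values.dtype, values.shape[1:]))
    out = np.empty(base.shape, dtype=fields)
    for n in base.dtype.names:
        out[n] = base[n]      # copy each original field
    out[name] = values        # fill the new subarray field
    return out

base = np.array(
    [(0, [4.3, 3.2]), (1, [1.4, 5.6]), (2, [6., 2.5]), (3, [4.5, 5.4])],
    dtype=[('id', 'i1'), ('pos', 'f2', (2,))]
)
appended = append_subarray_field(base, 'new', np.ones((4, 2), dtype='i1'))
print(appended.dtype.names)    # ('id', 'pos', 'new')
print(appended['new'].shape)   # (4, 2)
```

Because the new field is an ordinary field of the result, it can also be filled with placeholder values first and overwritten later via appended['new'][...] = something_else.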
bgamari answered Nov 13 '22 20:11


append_fields() requires that the two arrays have the same shape, which in this case they do not. Printing the two arrays makes this obvious:

>>> my_structured_array
array([(0, [4.30078125, 3.19921875]), (1, [1.400390625, 5.6015625]),
       (2, [6.0, 2.5]), (3, [4.5, 5.3984375])], 
      dtype=[('id', '|i1'), ('pos', '<f2', (2,))])
>>> my_new_field
array([[1, 1],
       [1, 1],
       [1, 1],
       [1, 1]], dtype=int8)

As you can see, my_structured_array is an array of length 4 where each element is a tuple containing two objects, an int and a list of two floats.

my_new_field, on the other hand, is an array of length 4 where each element is a list of two ints. It's like trying to add apples and oranges.

Make your arrays the same shape and they'll add together.
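
One way to "make the shapes the same" without copying data is to reinterpret each row of the plain (4, 2) array as a single record with a (2,)-shaped field. This is a minimal sketch, assuming the plain array is C-contiguous; the field name 'new' is just taken from the question.

```python
import numpy as np

# The (4, 2) int8 array that the question ends up with.
plain = np.ones((4, 2), dtype=np.int8)

# One record per row: the two int8 values of a row become one (2,)-shaped field.
record_dtype = np.dtype([('new', np.int8, (2,))])
records = plain.view(record_dtype).reshape(-1)

print(plain.shape)            # (4, 2)
print(records.shape)          # (4,)
print(records['new'].shape)   # (4, 2)
```

The result has shape (4,), matching my_structured_array, while the field itself keeps its (4, 2) data. Note that records is a view, so writing to it also changes plain.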

radicalbiscuit answered Nov 13 '22 21:11