python: numpy: concatenation of named arrays

Tags:

numpy

Consider the following simple example:

x = numpy.array([(1,2),(3,4)],dtype=[('a','<f4'),('b','<f4')])
y = numpy.array([(1,2),(3,4)],dtype=[('c','<f4'),('d','<f4')])
numpy.hstack((x,y))

One will get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33\lib\site-packages\numpy\core\shape_base.py", line 226, in vstack
    return _nx.concatenate(list(map(atleast_2d,tup)),0)
TypeError: invalid type promotion

If the arrays have no field names (titles), it works:

x = numpy.array([(1,2),(3,4)],dtype='<f4')
y = numpy.array([(1,2),(3,4)],dtype='<f4')
numpy.hstack((x,y))

If I remove the names from x and y it works too.

Question: how can I concatenate (vstack or hstack) numpy arrays that have field names? Note: numpy.lib.recfunctions.stack_arrays doesn't work well either.


asked Sep 02 '13 13:09

1 Answer

The problem is that the types are different. The "title" is part of the type, and y uses different names from x, so the types are incompatible. If you use compatible types, everything works fine:

>>> x = numpy.array([(1, 2), (3, 4)], dtype=[('a', '<f4'), ('b', '<f4')])
>>> y = numpy.array([(5, 6), (7, 8)], dtype=[('a', '<f4'), ('b', '<f4')])
>>> numpy.vstack((x, y))
array([[(1.0, 2.0), (3.0, 4.0)],
       [(5.0, 6.0), (7.0, 8.0)]], 
      dtype=[('a', '<f4'), ('b', '<f4')])
>>> numpy.hstack((x, y))
array([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (7.0, 8.0)], 
      dtype=[('a', '<f4'), ('b', '<f4')])
>>> numpy.dstack((x, y))
array([[[(1.0, 2.0), (5.0, 6.0)],
        [(3.0, 4.0), (7.0, 8.0)]]], 
      dtype=[('a', '<f4'), ('b', '<f4')])

Sometimes dstack, etc. are smart enough to coerce types in a sensible way, but numpy has no way to know how to combine record arrays with different user-defined field names.
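
To see why the stack in the question fails, one can compare the two dtypes directly. The sketch below also shows one possible workaround, under the assumption that the fields of y line up positionally with those of x and have the same sizes: reinterpreting y with x's dtype through a view. The name y_as_x is only illustrative.

```python
import numpy

x = numpy.array([(1, 2), (3, 4)], dtype=[('a', '<f4'), ('b', '<f4')])
y = numpy.array([(5, 6), (7, 8)], dtype=[('c', '<f4'), ('d', '<f4')])

# Field names are part of the dtype, so these two dtypes are not equal.
print(x.dtype == y.dtype)  # False

# Assumption: 'c' corresponds to 'a' and 'd' to 'b'. A view reinterprets
# the same memory under x's field names without copying any data.
y_as_x = y.view(x.dtype)

# With identical dtypes, concatenation appends the records of y to x.
stacked = numpy.concatenate((x, y_as_x))
print(stacked.dtype.names)  # ('a', 'b')
print(stacked.shape)        # (4,)
```

This only makes sense when the fields really mean the same thing in both arrays; otherwise a new dtype is needed.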

If you want to concatenate the datatypes, then you have to create a new datatype. Don't make the mistake of thinking that the sequence of names (x['a'], x['b']...) constitutes a true dimension of the array; x and y above are 1-d arrays of blocks of memory, each of which contains two 32-bit floats that can be accessed using the names 'a' and 'b'. But as you can see, if you access an individual item in the array, you don't get another array as you would if it were truly a second dimension. You can see the difference here:

>>> x = numpy.array([(1, 2), (3, 4)], dtype=[('a', '<f4'), ('b', '<f4')])
>>> x[0]
(1.0, 2.0)
>>> type(x[0])
<type 'numpy.void'>

>>> z = numpy.array([(1, 2), (3, 4)])
>>> z[0]
array([1, 2])
>>> type(z[0])
<type 'numpy.ndarray'>

This is what allows record arrays to contain heterogeneous data; record arrays can contain both strings and ints, but the trade-off is that you don't get the full power of an ndarray at the level of individual records.
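
As a small illustration of that trade-off, here is a sketch with a made-up array (the field names 'name' and 'age' and the values are hypothetical, chosen only for the example). Each field behaves like an ordinary column, while a single record is a numpy.void scalar rather than an ndarray.

```python
import numpy

# Hypothetical data: one string field and one integer field per record.
people = numpy.array([('alice', 30), ('bob', 25)],
                     dtype=[('name', 'U10'), ('age', '<i4')])

# The array is 1-d: the fields are not a second dimension.
print(people.shape)  # (2,)

# Selecting a field gives a regular homogeneous ndarray.
ages = people['age']
print(ages.mean())  # 27.5

# Selecting a record gives a numpy.void scalar, accessed by field name.
first = people[0]
print(type(first) is numpy.void)  # True
print(first['name'])              # alice
```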

The upshot is that to join individual blocks of memory, you actually have to modify the dtype of the array. There are a few ways to do this, but the simplest I could find involves the little-known numpy.lib.recfunctions module (which I see you've already found!):

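>>> import numpy.lib.recfunctions  # the submodule must be imported explicitly
>>> # y here is the array from the question, with fields 'c' and 'd'
>>> numpy.lib.recfunctions.rec_append_fields(x, 
                                             y.dtype.names, 
                                             [y[n] for n in y.dtype.names])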
rec.array([(1.0, 2.0, 1.0, 2.0), (3.0, 4.0, 3.0, 4.0)], 
      dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4')])
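
For comparison, here is a sketch of two of the other ways to do this, assuming x and y are defined as in the question. The first builds the combined dtype by hand and copies each field across; the second uses numpy.lib.recfunctions.merge_arrays with flatten=True. The names combined and merged are illustrative only.

```python
import numpy
import numpy.lib.recfunctions as rfn

x = numpy.array([(1, 2), (3, 4)], dtype=[('a', '<f4'), ('b', '<f4')])
y = numpy.array([(1, 2), (3, 4)], dtype=[('c', '<f4'), ('d', '<f4')])

# Manual approach: a new dtype holding the fields of both arrays.
new_dtype = numpy.dtype(x.dtype.descr + y.dtype.descr)
combined = numpy.empty(x.shape, dtype=new_dtype)

# Copy the data one field at a time.
for name in x.dtype.names:
    combined[name] = x[name]
for name in y.dtype.names:
    combined[name] = y[name]

print(combined.dtype.names)  # ('a', 'b', 'c', 'd')

# Library approach: merge field by field into one flat structured array.
merged = rfn.merge_arrays((x, y), flatten=True)
print(merged.dtype.names)  # ('a', 'b', 'c', 'd')
```

Both approaches assume the two arrays have the same length and no field names in common.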

answered Oct 08 '22 03:10

senderle

Hanan Shteingart
