
numpy Stacking 1D arrays into structured array

Tags:

python

numpy

I'm running Numpy 1.6 in Python 2.7, and have some 1D arrays I'm getting from another module. I would like to take these arrays and pack them into a structured array so I can index the original 1D arrays by name. I am having trouble figuring out how to get the 1D arrays into a 2D array and make the dtype access the right data. My MWE is as follows:

>>> import numpy as np
>>> 
>>> x = np.random.randint(10,size=3)
>>> y = np.random.randint(10,size=3)
>>> z = np.random.randint(10,size=3)
>>> x
array([9, 4, 7])
>>> y
array([5, 8, 0])
>>> z
array([2, 3, 6])
>>> 
>>> w = np.array([x,y,z])
>>> w.dtype=[('x','i4'),('y','i4'),('z','i4')]
>>> w
array([[(9, 4, 7)],
       [(5, 8, 0)],
       [(2, 3, 6)]], 
      dtype=[('x', '<i4'), ('y', '<i4'), ('z', '<i4')])
>>> w['x']
array([[9],
       [5],
       [2]])
>>> 
>>> u = np.vstack((x,y,z))
>>> u.dtype=[('x','i4'),('y','i4'),('z','i4')]
>>> u
array([[(9, 4, 7)],
       [(5, 8, 0)],
       [(2, 3, 6)]],    
      dtype=[('x', '<i4'), ('y', '<i4'), ('z', '<i4')]) 

>>> u['x']
array([[9],
       [5],
       [2]])

>>> v = np.column_stack((x,y,z))
>>> v.dtype=[('x','i4'),('y','i4'),('z','i4')]
>>> v
array([[(9, 4, 7), (5, 8, 0), (2, 3, 6)]], 
      dtype=[('x', '<i4'), ('y', '<i4'), ('z', '<i4')])
>>> v['x']
array([[9, 5, 2]])

As you can see, my original x array contains [9,4,7], but none of the ways I've tried to stack the arrays and then index by 'x' returns the original x array. Is there a way to do this, or am I approaching it the wrong way?

Thav asked Jul 03 '13 19:07


3 Answers

One way to go is

wtype=np.dtype([('x',x.dtype),('y',y.dtype),('z',z.dtype)])
w=np.empty(len(x),dtype=wtype)
w['x']=x
w['y']=y
w['z']=z

Note that the size of the integers returned by randint depends on your platform: instead of an int32 ('i4'), on my machine I get an int64 ('i8'). Taking each field's dtype from the array itself, as above, is therefore more portable than hardcoding 'i4'.
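
The field-by-field assignment above can be wrapped in a small helper. This is only a sketch: pack_fields and its parameters are hypothetical names, not part of NumPy, and it assumes every input is a 1D array of the same length. The float z column is made up here to show that mixed dtypes are fine with this approach.

```python
import numpy as np

def pack_fields(names, arrays):
    """Pack equal-length 1D arrays into one structured array (hypothetical helper)."""
    arrays = [np.asarray(a) for a in arrays]
    if len(names) != len(arrays):
        raise ValueError("need exactly one name per array")
    if len({a.shape for a in arrays}) != 1 or arrays[0].ndim != 1:
        raise ValueError("all arrays must be 1D and of equal length")
    # Take each field's dtype from the data instead of hardcoding 'i4' or 'i8'.
    dtype = np.dtype([(name, a.dtype) for name, a in zip(names, arrays)])
    out = np.empty(len(arrays[0]), dtype=dtype)
    # Copy each source array into its named field.
    for name, a in zip(names, arrays):
        out[name] = a
    return out

x = np.array([9, 4, 7])
y = np.array([5, 8, 0])
z = np.array([2.5, 3.5, 6.5])  # illustrative float column, not from the question

w = pack_fields(("x", "y", "z"), (x, y, z))
```

With this, w['x'] gives back the values of x, and each field keeps the dtype of the array it came from.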

gg349 answered Nov 15 '22 14:11


You want to use np.column_stack:

import numpy as np

x = np.random.randint(10,size=3)
y = np.random.randint(10,size=3)
z = np.random.randint(10,size=3)

w = np.column_stack((x, y, z))
w = w.ravel().view([('x', x.dtype), ('y', y.dtype), ('z', z.dtype)])

>>> w
array([(5, 1, 8), (8, 4, 9), (4, 2, 6)], 
      dtype=[('x', '<i4'), ('y', '<i4'), ('z', '<i4')])
>>> x
array([5, 8, 4])
>>> y
array([1, 4, 2])
>>> z
array([8, 9, 6])
>>> w['x']
array([5, 8, 4])
>>> w['y']
array([1, 4, 2])
>>> w['z']
array([8, 9, 6])
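
One caveat worth spelling out: the view trick reinterprets the stacked buffer, so it only makes sense when all the arrays share one dtype (np.column_stack would otherwise upcast them to a common type, and the per-field dtypes would no longer match the bytes). A guarded sketch, where view_as_struct is a hypothetical name and not a NumPy function:

```python
import numpy as np

def view_as_struct(names, arrays):
    """Stack same-dtype 1D arrays and view each row as one record (hypothetical helper)."""
    arrays = [np.asarray(a) for a in arrays]
    if len({a.dtype for a in arrays}) != 1:
        raise TypeError("the view trick needs all arrays to share one dtype")
    stacked = np.column_stack(arrays)  # shape (n, k): one row per record
    # Every field uses the stacked dtype, so one record spans exactly one row.
    rec_dtype = np.dtype([(name, stacked.dtype) for name in names])
    # ravel() yields a flat C-ordered buffer (copying if needed) before the view.
    return stacked.ravel().view(rec_dtype)

x = np.array([5, 8, 4])
y = np.array([1, 4, 2])
z = np.array([8, 9, 6])

w = view_as_struct(("x", "y", "z"), (x, y, z))
```

For arrays with different dtypes, filling an empty structured array field by field is the safer route.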

Jaime answered Nov 15 '22 14:11


To build on top of the chosen answer, you can make this process dynamic:

  • First, loop over your arrays (each one can be a single column)
  • For each array, loop over its fields to collect the names and datatypes
  • Create the empty array using those datatypes
  • Repeat the same loops to populate the array

SETUP

import numpy as np

# First, let's build a structured array
rows = [
    ("A", 1),
    ("B", 2),
    ("C", 3),
]
dtype = [
    ("letter", "U1"),  # one-character string
    ("number", int),
]
arr = np.array(rows, dtype=dtype)

# Then, let's create a standalone column of the same length.
# Each record is a 1-tuple so that it matches the single-field dtype.
rows = [
    (1.0,),
    (2.0,),
    (3.0,),
]
dtype = [
    ("float", float),
]
new_col = np.array(rows, dtype=dtype)

SOLVING THE PROBLEM

# Now, we dynamically create an empty array with the dtypes from our structured array and our new column:
dtypes = []
for array in [arr, new_col]:
    for name in array.dtype.names:
        dtype = (name, array[name].dtype)
        dtypes.append(dtype)
new_arr = np.empty(len(new_col), dtype=dtypes)

# Finally, put your data in the empty array:
for array in [arr, new_col]:
    for name in array.dtype.names:
        new_arr[name] = array[name]
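
The two loops above can also be folded into one reusable function. This is a sketch under a few assumptions: merge_fields is a hypothetical name, all inputs are structured arrays of equal length, and duplicate field names are rejected rather than renamed.

```python
import numpy as np

def merge_fields(*arrays):
    """Combine the fields of several structured arrays into one (hypothetical helper)."""
    lengths = {len(a) for a in arrays}
    if len(lengths) != 1:
        raise ValueError("all arrays must have the same length")
    # First pass: collect (name, dtype) for every field of every array.
    descr = [(name, a.dtype[name]) for a in arrays for name in a.dtype.names]
    names = [name for name, _ in descr]
    if len(set(names)) != len(names):
        raise ValueError("field names must be unique across arrays")
    out = np.empty(lengths.pop(), dtype=descr)
    # Second pass: copy each field into the new array.
    for a in arrays:
        for name in a.dtype.names:
            out[name] = a[name]
    return out

arr = np.array([("A", 1), ("B", 2), ("C", 3)],
               dtype=[("letter", "U1"), ("number", int)])
new_col = np.array([(1.0,), (2.0,), (3.0,)], dtype=[("float", float)])

merged = merge_fields(arr, new_col)
```

Here merged has the fields letter, number and float, in that order, and merged['float'] returns the values of the standalone column.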

Hope it helps

Jordan Kowal answered Nov 15 '22 14:11