Converting a 2D numpy array to a structured array

Tags: python, numpy

I'm trying to convert a two-dimensional array into a structured array with named fields. I want each row in the 2D array to be a new record in the structured array. Unfortunately, nothing I've tried is working the way I expect.

I'm starting with:

    >>> myarray = numpy.array([("Hello",2.5,3),("World",3.6,2)])
    >>> print myarray
    [['Hello' '2.5' '3']
     ['World' '3.6' '2']]

I want to convert to something that looks like this:

    >>> newarray = numpy.array([("Hello",2.5,3),("World",3.6,2)], dtype=[("Col1","S8"),("Col2","f8"),("Col3","i8")])
    >>> print newarray
    [('Hello', 2.5, 3L) ('World', 3.6000000000000001, 2L)]

What I've tried:

    >>> newarray = myarray.astype([("Col1","S8"),("Col2","f8"),("Col3","i8")])
    >>> print newarray
    [[('Hello', 0.0, 0L) ('2.5', 0.0, 0L) ('3', 0.0, 0L)]
     [('World', 0.0, 0L) ('3.6', 0.0, 0L) ('2', 0.0, 0L)]]

    >>> newarray = numpy.array(myarray, dtype=[("Col1","S8"),("Col2","f8"),("Col3","i8")])
    >>> print newarray
    [[('Hello', 0.0, 0L) ('2.5', 0.0, 0L) ('3', 0.0, 0L)]
     [('World', 0.0, 0L) ('3.6', 0.0, 0L) ('2', 0.0, 0L)]]

Both of these approaches convert each individual entry in myarray into a full record with the given dtype, which is why the extra zeros appear. I can't figure out how to get it to convert each row into a single record instead.

Another attempt:

    >>> newarray = myarray.copy()
    >>> newarray.dtype = [("Col1","S8"),("Col2","f8"),("Col3","i8")]
    >>> print newarray
    [[('Hello', 1.7219343871178711e-317, 51L)]
     [('World', 1.7543139673493688e-317, 50L)]]

This time no actual conversion is performed. The existing data in memory is just re-interpreted as the new data type.
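To illustrate what "re-interpreted" means here, the following minimal sketch uses `ndarray.view`, which applies the same kind of reinterpretation as assigning to `.dtype`. The array `raw` and the field names `a` and `b` are made up for the example. It only gives sensible values because the new dtype matches the byte layout of the existing data, which is not the case for an array of strings.

```python
import numpy as np

# Four 32-bit integers stored contiguously (16 bytes in total).
raw = np.array([1, 2, 3, 4], dtype=np.int32)

# Reinterpret the same 16 bytes as two records of two 32-bit integers each.
# No values are converted; only the interpretation of the bytes changes.
pairs = raw.view([("a", "i4"), ("b", "i4")])

print(pairs.shape)   # one record per 8 bytes of the original buffer
print(pairs["a"], pairs["b"])

# Both arrays share the same memory, so a write through one is visible
# through the other.
pairs["a"][0] = 10
print(raw[0])
```

Because nothing is parsed or converted, this technique cannot turn the text '2.5' into the float 2.5; it can only relabel bytes that already have the right layout.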

The array that I'm starting with is being read in from a text file. The data types are not known ahead of time, so I can't set the dtype at the time of creation. I need a high-performance and elegant solution that will work well for general cases since I will be doing this type of conversion many, many times for a large variety of applications.

Thanks!


asked Sep 01 '10 23:09

Emma

1 Answer

You can "create a record array from a (flat) list of arrays" using numpy.core.records.fromarrays as follows:

    >>> import numpy as np
    >>> myarray = np.array([("Hello",2.5,3),("World",3.6,2)])
    >>> print myarray
    [['Hello' '2.5' '3']
     ['World' '3.6' '2']]

    >>> newrecarray = np.core.records.fromarrays(myarray.transpose(),
    ...                                          names='col1, col2, col3',
    ...                                          formats = 'S8, f8, i8')

    >>> print newrecarray
    [('Hello', 2.5, 3) ('World', 3.5999999046325684, 2)]
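On Python 3 with a recent NumPy, the same idea can be sketched with `np.rec.fromarrays`, the public `numpy.rec` name for the same function. This is a hedged adaptation rather than the original session: it assumes a Unicode field (`U8`) in place of `S8`, since Python 3 strings are Unicode, and it builds the input directly from strings to mirror data read from a text file.

```python
import numpy as np

# A 2D array of strings, matching what the question prints for myarray.
myarray = np.array([["Hello", "2.5", "3"],
                    ["World", "3.6", "2"]])

# fromarrays expects one array per field, so pass the columns
# (the rows of the transposed array).
newrecarray = np.rec.fromarrays(myarray.transpose(),
                                names="col1,col2,col3",
                                formats="U8,f8,i8")

print(newrecarray)
print(newrecarray.dtype)
```

Each string column is cast to the requested field type when it is assigned, so `col2` ends up as floats and `col3` as integers.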

I was trying to do something similar. I found that when numpy creates a structured array from an existing 2D array (using np.core.records.fromarrays), it expects one array per field, so it treats each row of the 2D array as a field and therefore each column (instead of each row) as a record. That is why you have to transpose it first. This behavior does not seem very intuitive, but perhaps there is a good reason for it.
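One way to see why the transpose is needed is to do the per-field assignment by hand. The sketch below is an illustration, not what `fromarrays` literally does internally: `rows_to_records` is a hypothetical helper name, and the `U8` field type is an assumption for Python 3 strings. It allocates an empty structured array with one record per row and fills each named field from the matching column.

```python
import numpy as np

def rows_to_records(table, dtype):
    """Hypothetical helper: turn each row of a 2D array into one record."""
    dtype = np.dtype(dtype)
    # One record per row of the input table.
    out = np.empty(table.shape[0], dtype=dtype)
    # Fill field by field: column i of the table becomes field i.
    for i, name in enumerate(dtype.names):
        out[name] = table[:, i].astype(dtype[name])
    return out

myarray = np.array([["Hello", "2.5", "3"],
                    ["World", "3.6", "2"]])

newarray = rows_to_records(
    myarray, [("Col1", "U8"), ("Col2", "f8"), ("Col3", "i8")])
print(newarray)
```

Because the data is copied column by column, no transpose is needed here, at the cost of a short explicit loop over the fields.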

answered Sep 23 '22 07:09

Curious2learn

