How to convert numpy.recarray to numpy.array?

What's the best way to convert NumPy's recarray to a normal array?

I could call .tolist() first and then array() again, but that seems somewhat inefficient.

Example:

>>> import numpy as np
>>> a = np.recarray((2,), dtype=[('x', int), ('y', float), ('z', int)])
>>> a
rec.array([(30408891, 9.2944097561804909e-296, 30261980),
           (44512448, 4.5273310988985789e-300, 29979040)],
          dtype=[('x', '<i4'), ('y', '<f8'), ('z', '<i4')])
>>> np.array(a.tolist())
array([[  3.04088910e+007,   9.29440976e-296,   3.02619800e+007],
       [  4.45124480e+007,   4.52733110e-300,   2.99790400e+007]])

asked Oct 20 '11 20:10

Muppet

2 Answers

By "normal array" I take it you mean a NumPy array of homogeneous dtype. Given a recarray, such as:

>>> a = np.array([(0, 1, 2),
...               (3, 4, 5)], [('x', int), ('y', float), ('z', int)]).view(np.recarray)
>>> a
rec.array([(0, 1.0, 2), (3, 4.0, 5)],
          dtype=[('x', '<i4'), ('y', '<f8'), ('z', '<i4')])

we must first give every column the same dtype. We can then convert it to a "normal array" by viewing the data as that single dtype:

>>> a.astype([('x', '<f8'), ('y', '<f8'), ('z', '<f8')]).view('<f8')
array([ 0.,  1.,  2.,  3.,  4.,  5.])
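
The view above comes back one-dimensional. If one row per record is wanted, it can be reshaped afterwards. The following is a minimal sketch under a few assumptions of mine: the field dtypes are spelled out as 'i4'/'f8' so the result does not depend on the platform's default int, and the names float_dtype, flat and table are only illustrative.

```python
import numpy as np

# Same sample data as above, with explicit field sizes (an assumption made
# here so the example behaves the same on every platform).
a = np.array([(0, 1, 2), (3, 4, 5)],
             dtype=[('x', 'i4'), ('y', 'f8'), ('z', 'i4')]).view(np.recarray)

# Build the all-float dtype from the existing field names.
float_dtype = [(name, 'f8') for name in a.dtype.names]

# Cast every field to float64, then reinterpret the buffer as plain floats.
flat = a.astype(float_dtype).view('f8')

# One row per record, one column per field.
table = flat.reshape(len(a), -1)
print(table.shape)
```

Building float_dtype from a.dtype.names avoids retyping the field list, which helps when the recarray has many columns.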

astype returns a new NumPy array, so the above requires additional memory proportional to the size of a. Each row of a takes 4+8+4=16 bytes, while each row of a.astype(...) takes 8*3=24 bytes. Calling view requires no new memory, since view only changes how the underlying data is interpreted.
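
These memory claims can be checked directly. This is a small sketch, assuming explicit 'i4'/'f8' field sizes so the byte counts match the arithmetic above; converted and flat are illustrative names.

```python
import numpy as np

a = np.array([(0, 1, 2), (3, 4, 5)],
             dtype=[('x', 'i4'), ('y', 'f8'), ('z', 'i4')]).view(np.recarray)
converted = a.astype([('x', 'f8'), ('y', 'f8'), ('z', 'f8')])
flat = converted.view('f8')

print(a.itemsize)          # bytes per row of the original: 4 + 8 + 4 = 16
print(converted.itemsize)  # bytes per row after the cast: 8 * 3 = 24

# astype allocates a new buffer, while view reuses the buffer it is given.
print(np.shares_memory(a, converted))
print(np.shares_memory(converted, flat))
```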

a.tolist() returns a new Python list. Each Python number is a full object, which takes more bytes than the equivalent value stored in a NumPy array. So a.tolist() requires more memory than a.astype(...).
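
A rough way to see the difference is to add up sys.getsizeof over the list and compare it with nbytes of the converted array. This is only an estimate (it counts every element as if it were a separate object and ignores allocator overhead), and the names rows and list_bytes are illustrative.

```python
import sys
import numpy as np

# 100 records with three fields each, using explicit field sizes.
a = np.array(list(zip(*[iter(range(300))] * 3)),
             dtype=[('x', 'i4'), ('y', 'f8'), ('z', 'i4')]).view(np.recarray)
converted = a.astype([(name, 'f8') for name in a.dtype.names])

rows = a.tolist()  # a list of tuples holding Python ints and floats

# Size of the outer list, plus each tuple, plus each number inside it.
list_bytes = sys.getsizeof(rows) + sum(
    sys.getsizeof(row) + sum(sys.getsizeof(value) for value in row)
    for row in rows
)

print(list_bytes, converted.nbytes)
```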

Calling a.astype(...).view(...) is also faster than np.array(a.tolist()):

In [8]: a = np.array(list(zip(*[iter(range(300))]*3)), [('x', int), ('y', float), ('z', int)]).view(np.recarray)

In [9]: %timeit a.astype([('x', '<f8'), ('y', '<f8'), ('z', '<f8')]).view('<f8')
10000 loops, best of 3: 165 us per loop

In [10]: %timeit np.array(a.tolist())
1000 loops, best of 3: 683 us per loop
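
Outside IPython, the same comparison can be sketched with the standard timeit module. The helper names via_astype and via_tolist are mine, and the absolute numbers will vary with the machine and the NumPy version, so no timings are quoted here.

```python
import timeit
import numpy as np

a = np.array(list(zip(*[iter(range(300))] * 3)),
             dtype=[('x', 'i4'), ('y', 'f8'), ('z', 'i4')]).view(np.recarray)
float_dtype = [(name, 'f8') for name in a.dtype.names]

def via_astype():
    # Cast all fields to float64, view as plain floats, one row per record.
    return a.astype(float_dtype).view('f8').reshape(len(a), -1)

def via_tolist():
    # Round-trip through a Python list of tuples.
    return np.array(a.tolist())

t_astype = timeit.timeit(via_astype, number=1000)
t_tolist = timeit.timeit(via_tolist, number=1000)
print(f"astype/view: {t_astype:.4f} s, tolist: {t_tolist:.4f} s per 1000 calls")
```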

answered Oct 24 '22 14:10

unutbu

Here is a relatively clean solution using pandas:

>>> import numpy as np
>>> import pandas as pd
>>> a = np.recarray((2,), dtype=[('x', int), ('y', float), ('z', int)])
>>> arr = pd.DataFrame(a).to_numpy()
>>> arr
array([[9.38925058e+013, 0.00000000e+000, 1.40380704e+014],
       [1.40380704e+014, 6.93572751e-310, 1.40380484e+014]])
>>> arr.shape
(2, 3)
>>> arr.dtype
dtype('float64')

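First the data from the recarray are loaded into a pd.DataFrame, then the data are exported using the DataFrame.to_numpy method. As we can see, this method call has automatically converted all of the data to type float64, the common dtype of the int and float columns. (The values shown are arbitrary because np.recarray allocates uninitialized memory.)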
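
If pandas is not otherwise needed, recent NumPy versions (1.16 and later, as far as I know) ship a helper for this conversion, numpy.lib.recfunctions.structured_to_unstructured. This is a hedged sketch rather than part of either answer above; the sample data and the name table are illustrative.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

a = np.array([(0, 1, 2), (3, 4, 5)],
             dtype=[('x', 'i4'), ('y', 'f8'), ('z', 'i4')]).view(np.recarray)

# With no dtype argument the helper picks a common dtype for all fields
# (float64 for this mix of int32 and float64).
# np.asarray makes sure the result is a plain ndarray, not a recarray subclass.
table = np.asarray(rfn.structured_to_unstructured(a))

print(table.shape, table.dtype)
```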

answered Oct 24 '22 12:10

Jasha
