I have a list, my_list, with mixed data types that I want to convert into a NumPy array. However, I get the error TypeError: expected a readable buffer object. See the code below. I've tried to base my code on the NumPy documentation.
my_list = [['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5, 6, 0, 1.0], ['User_0', '2012-3', 0, 0, 4, 1.0]]
my_np_array = np.array(my_list, dtype='S30, S8, i4, i4, f32')
Having a data type (dtype) is one of the key features that distinguishes NumPy arrays from lists. In lists, the types of elements can be mixed.
While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous.
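A small sketch of what "homogeneous" means in practice: when np.array is given mixed values without an explicit dtype, it picks a single dtype that can hold all of them. The variable names are only for illustration.

```python
import numpy as np

# Ints and floats are promoted to one common numeric dtype.
numbers = np.array([1, 2.5])
print(numbers.dtype)

# Mixing in a string turns every element into a string.
mixed = np.array([1, 'a', 2.5])
print(mixed.dtype.kind)  # 'U' is the kind code for Unicode strings
print(mixed)
```

So a plain array built from mixed values silently loses the original types, which is why the question needs either a structured dtype or dtype=object.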
Yes, if you use numpy structured arrays, each element of the array would be a "structure", and the fields of the structure can have different datatypes.
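As a rough illustration of a structured array, the sketch below defines a dtype with named fields. The field names (user, period, count, score) are hypothetical and chosen only for this example.

```python
import numpy as np

# Hypothetical field names; each field has its own data type.
record_dtype = np.dtype([('user', 'U30'), ('period', 'U8'),
                         ('count', 'i4'), ('score', 'f4')])

# Each record is given as a tuple whose items match the fields in order.
records = np.array([('User_0', '2012-2', 1, 1.0),
                    ('User_0', '2012-3', 0, 1.0)], dtype=record_dtype)

print(records['user'])         # one field across all records
print(records['count'].dtype)  # that field is a homogeneous int32 column
print(records[0])              # one whole record
```

The array itself is one-dimensional: it holds two records, and the "columns" are reached by field name rather than by a second index.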
The 'O' type code means (Python) objects. In a dtype string, the first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where they are interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised.
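To make the type-code rule concrete, this short sketch prints the kind and item size for a few common codes. The particular list of codes is just a sample picked for illustration.

```python
import numpy as np

# kind is the first character; itemsize is the size of one item in bytes.
for code in ('i4', 'f8', 'S8', 'U8'):
    dt = np.dtype(code)
    print(code, dt.kind, dt.itemsize)

# For 'U8' the number counts characters, and NumPy stores 4 bytes per
# character, so the item size is 32 bytes rather than 8.
unicode_itemsize = np.dtype('U8').itemsize
bytes_itemsize = np.dtype('S8').itemsize
```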
Why not use dtype=object?
In [1]: my_list = [['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5, 6, 0, 1.0], ['User_0', '2012-3', 0, 0, 4, 1.0]]
In [2]: my_np_array = np.array(my_list, dtype=object)
In [3]: my_np_array
Out[3]:
array([['User_0', '2012-2', 1, 6, 0, 1.0],
       ['User_0', '2012-2', 5, 6, 0, 1.0],
       ['User_0', '2012-3', 0, 0, 4, 1.0]], dtype=object)
Note
The trade-off is memory usage. When you specify the dtype of each column, every value is stored inline at a fixed size, so the ndarray needs less memory than with dtype=object, where each element is a reference to a full Python object of arbitrary type.
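The memory point can be checked roughly as follows. This is only an estimate under stated assumptions: sys.getsizeof is used as a stand-in for the size of each Python object, and objects that Python shares between rows (small integers, identical string literals) are counted more than once, so the object total is an upper bound.

```python
import sys
import numpy as np

my_list = [['User_0', '2012-2', 1, 6, 0, 1.0],
           ['User_0', '2012-2', 5, 6, 0, 1.0],
           ['User_0', '2012-3', 0, 0, 4, 1.0]]

# Structured array: every row is stored inline as 30 + 8 + 4 + 4 + 4 + 4 bytes.
structured = np.array([tuple(row) for row in my_list],
                      dtype='S30, S8, i4, i4, i4, f4')

# Object array: the buffer holds only references to Python objects.
as_objects = np.array(my_list, dtype=object)

print(structured.dtype.itemsize)  # bytes per row
print(structured.nbytes)          # total bytes for the three rows
print(as_objects.nbytes)          # references only, not the objects themselves

# Rough size of the Python objects that the references point to.
object_payload = sum(sys.getsizeof(item) for row in my_list for item in row)
print(as_objects.nbytes + object_payload)
```

Note that nbytes of the object array on its own looks small because it counts just the references; the referenced objects are what make it more expensive overall.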
Your nested items should be tuples. You also omitted one i4 in your types:
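>>> # list(...) is needed on Python 3, where map returns an iterator; 'f4' is a 4-byte float
>>> my_np_array = np.array(list(map(tuple, my_list)), dtype='|S30, |S8, i4, i4, i4, f4')
>>> my_np_array  # output from Python 2; Python 3 shows the byte strings with a b prefix
array([('User_0', '2012-2', 1, 6, 0, 1.0),
       ('User_0', '2012-2', 5, 6, 0, 1.0),
       ('User_0', '2012-3', 0, 0, 4, 1.0)],
      dtype=[('f0', 'S30'), ('f1', 'S8'), ('f2', '<i4'), ('f3', '<i4'), ('f4', '<i4'), ('f5', '<f4')])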
As far as I know, NumPy uses tuples to represent the records of a structured array, so when the array items have multiple types you need to convert each inner list to a tuple whose items line up with the dtype fields.
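For Python 3, a variation on the same idea might look like the sketch below. It swaps the byte-string codes for Unicode ones ('U30', 'U8') so the values come back as ordinary str; that substitution is an assumption made for convenience here, not part of the original answer.

```python
import numpy as np

my_list = [['User_0', '2012-2', 1, 6, 0, 1.0],
           ['User_0', '2012-2', 5, 6, 0, 1.0],
           ['User_0', '2012-3', 0, 0, 4, 1.0]]

# Convert each inner list to a tuple so it is read as one record.
rows = [tuple(row) for row in my_list]
my_np_array = np.array(rows, dtype='U30, U8, i4, i4, i4, f4')

print(my_np_array.dtype.names)  # default field names f0 .. f5
print(my_np_array['f0'])        # first column as strings
print(my_np_array['f2'])        # third column as int32
```

Columns are addressed by the automatically generated field names, since a comma-separated dtype string does not name them.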