
Store different datatypes in one NumPy array?

I have two different arrays, one with strings and another with ints. I want to concatenate them into one array where each column keeps its original datatype. My current solution (see below) converts the entire array to a string dtype, which seems very memory-inefficient.

combined_array = np.concatenate((A, B), axis = 1)

Is it possible to have multiple dtypes in combined_array when A.dtype = string and B.dtype = int?
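For illustration, here is a minimal sketch of the behavior described above. The small arrays A and B are hypothetical stand-ins, since the original data is not shown:

```python
import numpy as np

# Hypothetical stand-ins for the A and B arrays in the question.
A = np.array([["a"], ["b"], ["c"]])   # strings, shape (3, 1)
B = np.array([[1], [2], [3]])         # ints, shape (3, 1)

combined_array = np.concatenate((A, B), axis=1)

print(combined_array.dtype.kind)      # 'U': the whole array is now a string array
print(repr(combined_array[0, 1]))     # the int 1 is stored as the string '1'
print(combined_array.dtype.itemsize)  # bytes used per element, ints included
```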

veor asked Jul 03 '12 11:07





2 Answers

One approach might be to use a record array. The "columns" won't be like the columns of standard numpy arrays, but for most use cases, this is sufficient:

>>> import numpy
>>> a = numpy.array(['a', 'b', 'c', 'd', 'e'])
>>> b = numpy.arange(5)
>>> records = numpy.rec.fromarrays((a, b), names=('keys', 'data'))
>>> records
rec.array([('a', 0), ('b', 1), ('c', 2), ('d', 3), ('e', 4)],
       dtype=[('keys', '|S1'), ('data', '<i8')])
>>> records['keys']
rec.array(['a', 'b', 'c', 'd', 'e'],
       dtype='|S1')
>>> records['data']
array([0, 1, 2, 3, 4])

Note that you can also do something similar with a standard array by specifying the datatype of the array. This is known as a "structured array":

>>> arr = numpy.array([('a', 0), ('b', 1)],
...                   dtype=([('keys', '|S1'), ('data', 'i8')]))
>>> arr
array([('a', 0), ('b', 1)],
      dtype=[('keys', '|S1'), ('data', '<i8')])
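When the two columns already exist as separate arrays, as in the question, one way to get a structured array is to allocate an empty one and fill it field by field. The following is a minimal sketch, not the answerer's code: A and B are hypothetical stand-ins for the question's arrays (taken here as 1-D), and the field names are reused from the example above.

```python
import numpy as np

# Hypothetical stand-ins for the question's string and int columns.
A = np.array(["a", "b", "c"])
B = np.array([10, 20, 30])

# One field per column; each field keeps the dtype of its source array.
dtype = np.dtype([("keys", A.dtype), ("data", B.dtype)])
combined = np.empty(A.shape[0], dtype=dtype)
combined["keys"] = A
combined["data"] = B

print(combined["data"].sum())   # integer arithmetic still works on the int field
print(combined.dtype.itemsize)  # bytes per row: string field plus int field
```

Each field keeps its source dtype, so the row size is the sum of the two field sizes rather than the size of a string wide enough to hold every value.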

The difference is that record arrays also allow attribute access to individual data fields. Standard structured arrays do not.

>>> records.keys
chararray(['a', 'b', 'c', 'd', 'e'],
      dtype='|S1')
>>> arr.keys
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'keys'
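If attribute access is wanted on an existing structured array, it can be viewed as a record array without copying. This is a minimal sketch, assuming Python 3 and a 'U1' text field in place of the '|S1' byte strings shown above. One caveat: a field whose name matches an existing ndarray attribute (such as data) is shadowed by that attribute, so it stays reachable only by index.

```python
import numpy as np

# Same layout as the structured array above, but with a text ('U1') field.
arr = np.array([("a", 0), ("b", 1)], dtype=[("keys", "U1"), ("data", "i8")])

# View the same buffer as a record array; no data is copied.
rec = arr.view(np.recarray)

print(rec.keys)      # attribute access to the 'keys' field
print(rec["data"])   # the 'data' field is read by index, because
                     # rec.data is ndarray's own buffer attribute
print(np.shares_memory(arr, rec))  # True: both names refer to one buffer
```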
senderle answered Sep 20 '22 17:09



A simple solution: convert your data to the object ('O') dtype.

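import numpy as np

z = np.zeros((2, 2), dtype='U2')  # 2x2 array of empty strings
o = np.ones((2, 1), dtype='O')    # 2x1 object array holding 1
np.hstack([o, z])                 # result is upcast to dtype=object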

creates the array:

array([[1, '', ''],
       [1, '', '']], dtype=object)
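To apply this to the arrays from the question, both inputs can be cast to object before stacking. This is a minimal sketch with hypothetical stand-ins for A and B. Note the trade-off: an object array stores references to ordinary Python objects, so each element keeps its own type, but the array no longer has a compact per-column dtype.

```python
import numpy as np

# Hypothetical stand-ins for the question's string and int columns.
A = np.array([["a"], ["b"]])
B = np.array([[1], [2]])

# Cast both sides to object so neither is converted to the other's type.
combined = np.hstack([A.astype(object), B.astype(object)])

print(combined.dtype)          # object
print(type(combined[0, 0]))    # <class 'str'>
print(type(combined[0, 1]))    # <class 'int'>
print(combined[:, 1].sum())    # the int column can still be summed
```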
codeMonkey answered Sep 20 '22 17:09
