 

numpy convert categorical string arrays to an integer array


I'm trying to convert a string array of categorical variables to an integer array of categorical variables.

Example:

    import numpy as np
    a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
    print a.dtype
    >>> |S1

    b = np.unique(a)
    print b
    >>> ['a' 'b' 'c']

    c = a.desired_function(b)
    print c, c.dtype
    >>> [1,2,3,1,2,3] int32

I realize this can be done with a loop but I imagine there is an easier way. Thanks.

wroscoe asked Jul 03 '10 18:07





1 Answer

np.unique has some optional return values.

return_inverse gives the integer encoding (zero-based indices into the sorted unique values), which I use very often:

    >>> b, c = np.unique(a, return_inverse=True)
    >>> b
    array(['a', 'b', 'c'],
          dtype='|S1')
    >>> c
    array([0, 1, 2, 0, 1, 2])
    >>> c+1
    array([1, 2, 3, 1, 2, 3])
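For reference, here is the same idea as a small standalone script in Python 3 syntax. It is only a sketch: the names `uniques`, `codes` and `labels` are mine, and the explicit cast to `int32` is an assumption based on the dtype the question asks for (the integer dtype NumPy returns by default depends on the platform).

```python
import numpy as np

a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])

# uniques: the sorted distinct values of `a`
# codes:   for each element of `a`, its zero-based index into `uniques`
uniques, codes = np.unique(a, return_inverse=True)

# Shift to 1-based labels and cast explicitly, since the question
# asked for int32 output.
labels = (codes + 1).astype(np.int32)

print(uniques)               # ['a' 'b' 'c']
print(labels, labels.dtype)  # [1 2 3 1 2 3] int32
```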

The inverse index can also be used to recreate the original array from the uniques:

    >>> b[c]
    array(['a', 'b', 'c', 'a', 'b', 'c'],
          dtype='|S1')
    >>> (b[c] == a).all()
    True
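If the categories are already fixed (the `a.desired_function(b)` shape from the question), one possible approach is `np.searchsorted` against the sorted uniques. This is a different technique from `return_inverse`, not part of the answer above, and `encode` is a hypothetical helper name. It assumes `categories` is sorted and unique, as the output of `np.unique` is.

```python
import numpy as np

def encode(values, categories):
    """Map each value to its zero-based index in `categories`.

    Hypothetical helper: `categories` must be sorted and unique,
    e.g. the result of np.unique. Raises ValueError on unseen values.
    """
    values = np.asarray(values)
    categories = np.asarray(categories)
    codes = np.searchsorted(categories, values)
    # searchsorted returns an insertion point even for values that are
    # not present, so check that every code points at a matching entry.
    clipped = np.minimum(codes, len(categories) - 1)
    if not np.array_equal(categories[clipped], values):
        raise ValueError("values contain entries not present in categories")
    return codes

a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
b = np.unique(a)
c = encode(a, b)
print(c)  # [0 1 2 0 1 2]
```

This is mainly useful when new data has to be encoded with the same mapping as earlier data; for a one-off conversion, `return_inverse` is simpler.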
Josef answered Sep 21 '22 17:09
