I'm trying to convert a string array of categorical variables to an integer array of categorical variables.
Ex.
    import numpy as np
    a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
    print a.dtype
    >>> |S1
    b = np.unique(a)
    print b
    >>> ['a' 'b' 'c']
    c = a.desired_function(b)
    print c, c.dtype
    >>> [1,2,3,1,2,3] int32
I realize this can be done with a loop, but I imagine there is an easier way. Thanks.
To convert a NumPy float array to an int array in Python, use the array's astype() method. astype() takes the target dtype and returns a copy of the array converted to that type.
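As a minimal sketch of that conversion (the array values and variable names here are made up for illustration), note that casting floats to an integer dtype truncates toward zero rather than rounding:

```python
import numpy as np

# Hypothetical float data, chosen only to show the truncation behaviour.
floats = np.array([1.7, -2.3, 3.0])

# astype() returns a new array; the original is left unchanged.
ints = floats.astype(np.int32)

print(ints, ints.dtype)
```

If rounding is wanted instead, one option is to call np.rint(floats) before the cast.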
We will use LabelEncoder from the sklearn library to convert categorical data to numerical data, calling its fit_transform() method in the process.
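A rough sketch of that approach applied to the array from the question, assuming scikit-learn is installed (the variable names are only illustrative):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])

encoder = LabelEncoder()
# fit_transform() learns the sorted set of labels and returns
# zero-based integer codes for each element of the input.
codes = encoder.fit_transform(a)

# classes_ holds the learned labels; inverse_transform() maps codes back.
labels = encoder.classes_
restored = encoder.inverse_transform(codes)

print(codes)
print(labels)
```

The codes start at 0, so add 1 if the numbering in the question (1, 2, 3) is needed.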
The elements of a NumPy array, or simply an array, are usually numbers, but can also be booleans, strings, or other objects.
np.unique has some optional return values.
return_inverse gives the integer encoding, which I use very often:
    >>> b, c = np.unique(a, return_inverse=True)
    >>> b
    array(['a', 'b', 'c'], dtype='|S1')
    >>> c
    array([0, 1, 2, 0, 1, 2])
    >>> c+1
    array([1, 2, 3, 1, 2, 3])
It can also be used to recreate the original array from the unique values:
    >>> b[c]
    array(['a', 'b', 'c', 'a', 'b', 'c'], dtype='|S1')
    >>> (b[c] == a).all()
    True
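For anyone on Python 3, the same idea as a standalone script might look like the sketch below. The explicit cast to int32 is my own addition to match the dtype asked for in the question; np.unique itself returns the platform's default index type.

```python
import numpy as np

a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])

# labels: sorted unique values; codes: index into labels for each element of a.
labels, codes = np.unique(a, return_inverse=True)

# Shift to 1-based codes and cast to int32, as requested in the question.
codes_from_one = (codes + 1).astype(np.int32)

print(codes_from_one, codes_from_one.dtype)

# Indexing labels with the zero-based codes rebuilds the original array.
print((labels[codes] == a).all())
```

Because the unique values come back sorted, the codes follow sorted order of the labels, not order of first appearance.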