Is it possible to convert a string vector into an indexed one using numpy?
Suppose I have an array of strings like ['ABC', 'DEF', 'GHI', 'DEF', 'ABC'], etc. I want to convert it to an array of integers like [0, 1, 2, 1, 0]. Is this possible using numpy? I know that Pandas has a Series class that can do this, courtesy of this answer. Is there something similar for numpy as well?
Edit: np.unique() returns the unique values among all elements. What I'm trying to do is convert the labels in the Iris dataset to indices, such as 0 for Iris-setosa, 1 for Iris-versicolor and 2 for Iris-virginica. Is there a way to do this using numpy?
The argsort() method returns the indices that would sort a given array. It performs an indirect sort along the given axis using the algorithm specified by the kind keyword, and returns an array of indices with the same shape as arr.
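As a minimal sketch of the indirect sort described above (the array values and variable names are made up for illustration), the indices returned by argsort can be used to reorder the original array:

```python
import numpy as np

# Illustrative data, not taken from the question.
arr = np.array([30, 10, 20])

# Indices that would sort arr; kind selects the sorting algorithm.
order = np.argsort(arr, kind="stable")

# Indexing with those indices yields the sorted array.
sorted_arr = arr[order]

print(order)       # [1 2 0]
print(sorted_arr)  # [10 20 30]
```

Note that argsort gives sort positions, not label codes, so on its own it does not answer the question about converting strings to indices.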
ndarrays can be indexed using the standard Python x[obj] syntax, where x is the array and obj the selection. There are different kinds of indexing available depending on obj: basic indexing, advanced indexing and field access.
The numpy.argmax() function returns the indices of the maximum element of an array along a given axis.
numpy.unravel_index(indices, shape, order='C') converts a flat index or array of flat indices into a tuple of coordinate arrays. Its indices parameter is array_like: an integer array whose elements are indices into the flattened version of an array of dimensions shape.
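A small sketch of how argmax and unravel_index fit together, using a made-up 2x3 array: without an axis argument, argmax returns a flat index, which unravel_index turns back into row and column coordinates.

```python
import numpy as np

# Illustrative 2x3 array, not taken from the question.
grid = np.array([[1, 7, 3],
                 [4, 5, 6]])

# With no axis given, argmax returns an index into the flattened array.
flat_idx = np.argmax(grid)

# Convert the flat index into (row, column) coordinates.
coords = np.unravel_index(flat_idx, grid.shape)

# With axis=0, argmax returns the row index of the maximum in each column.
per_column = np.argmax(grid, axis=0)

print(flat_idx)      # 1
print(grid[coords])  # 7
print(per_column)    # [1 0 1]
```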
Use numpy.unique with the parameter return_inverse=True. Be aware that it handles NaN values differently from pandas; see "factorizing values" in the pandas documentation:
import numpy as np
L = ['ABC', 'DEF', 'GHI', 'DEF', 'ABC']
print(np.unique(L, return_inverse=True)[1])
[0 1 2 1 0]
pandas.factorize also works well with a list or an array:
import pandas as pd
print(pd.factorize(L)[0])
[0 1 2 1 0]
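Applied to the Iris labels from the question, the numpy.unique approach might look like the sketch below. The sample label array and the helper name encode_by_first_appearance are assumptions made for illustration. One point worth knowing: np.unique returns the unique values sorted, so the codes follow sorted order rather than order of first appearance; the hypothetical helper shows one way to get appearance order using only numpy.

```python
import numpy as np

# Illustrative sample of Iris labels (not the full dataset).
labels = np.array(["Iris-setosa", "Iris-versicolor", "Iris-virginica",
                   "Iris-versicolor", "Iris-setosa"])

# classes holds the sorted unique labels; codes indexes into classes.
classes, codes = np.unique(labels, return_inverse=True)
print(codes)  # [0 1 2 1 0]

# The original labels can be recovered from the codes.
restored = classes[codes]


def encode_by_first_appearance(values):
    """Hypothetical helper: codes numbered in order of first appearance."""
    uniques, first_idx, inverse = np.unique(
        values, return_index=True, return_inverse=True
    )
    # Order the sorted uniques by where each one first occurs.
    order = np.argsort(first_idx)
    # rank[i] is the appearance-order code of the i-th sorted unique value.
    rank = np.empty_like(order)
    rank[order] = np.arange(len(order))
    return uniques[order], rank[inverse]


seen, seen_codes = encode_by_first_appearance(["DEF", "ABC", "DEF", "GHI"])
print(seen_codes)  # [0 1 0 2]
```

For the Iris labels the two orderings coincide, since the class names already appear in alphabetical order.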