NumPy: convert labels into indexes

Is it possible to convert a vector of strings into a vector of integer indices using NumPy?

Suppose I have an array of strings like ['ABC', 'DEF', 'GHI', 'DEF', 'ABC']. I want to convert it to an array of integers like [0, 1, 2, 1, 0]. Is this possible using NumPy? I know that pandas has a Series class that can do this, courtesy of this answer. Is there something similar in NumPy?

Edit: np.unique() returns the unique values among all elements. What I'm trying to do is convert the labels in the Iris dataset to indices: 0 for Iris-setosa, 1 for Iris-versicolor, and 2 for Iris-virginica. Is there a way to do this using NumPy?

867

asked May 02 '18 07:05

srdg

1 Answer

Use numpy.unique with the parameter return_inverse=True. Note that it handles NaNs differently from pandas.factorize; see "factorizing values" in the pandas documentation:

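import numpy as np

L = ['ABC', 'DEF', 'GHI', 'DEF', 'ABC']

# With return_inverse=True, the second returned array holds, for each
# element of L, its index in the sorted array of unique values.
print(np.unique(L, return_inverse=True)[1])
[0 1 2 1 0]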
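Applied to the Iris labels from the question, the same call can be sketched as follows. The `species` array is a small hypothetical sample made up for illustration, not the real dataset. Because numpy.unique returns the unique values in sorted order, Iris-setosa, Iris-versicolor, and Iris-virginica map to 0, 1, and 2 as requested.

```python
import numpy as np

# Hypothetical sample of Iris species labels (an assumption, not the full dataset).
species = np.array([
    "Iris-virginica", "Iris-setosa", "Iris-versicolor",
    "Iris-setosa", "Iris-virginica",
])

# classes: sorted unique labels; codes: index of each label within classes.
classes, codes = np.unique(species, return_inverse=True)

print(classes)  # ['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']
print(codes)    # [2 0 1 0 2]

# Indexing classes with codes recovers the original labels.
restored = classes[codes]
print((restored == species).all())  # True
```

Keeping `classes` around is useful because it lets you translate predicted indices back into label names later.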

pandas.factorize also works well with a list or an array:

import pandas as pd

# factorize returns (codes, uniques); [0] selects the integer codes.
print(pd.factorize(L)[0])
[0 1 2 1 0]
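The two approaches give the same result above only because the labels already appear in sorted order. Here is a minimal sketch of where they differ, using made-up input arrays: numpy.unique numbers labels by sorted order, while pandas.factorize numbers them by order of first appearance unless sort=True is passed, and marks missing values with -1 by default. The sketch passes NumPy arrays rather than plain lists, since recent pandas versions may warn about list input.

```python
import numpy as np
import pandas as pd

# Hypothetical labels that are deliberately not in sorted order.
labels = np.array(["GHI", "ABC", "DEF", "ABC"], dtype=object)

# numpy.unique: codes index into the sorted unique values.
np_codes = np.unique(labels, return_inverse=True)[1]
print(np_codes)      # [2 0 1 0]

# pandas.factorize: codes follow the order of first appearance.
pd_codes, pd_uniques = pd.factorize(labels)
print(pd_codes)      # [0 1 2 1]

# sort=True makes factorize agree with numpy.unique.
sorted_codes, sorted_uniques = pd.factorize(labels, sort=True)
print(sorted_codes)  # [2 0 1 0]

# Missing values get the sentinel code -1 and are left out of uniques.
with_missing = np.array(["ABC", None, "DEF", "ABC"], dtype=object)
missing_codes, missing_uniques = pd.factorize(with_missing)
print(missing_codes)  # [ 0 -1  1  0]
```

If the integer codes must be stable across datasets (for example train and test splits), it is safer to fix the label-to-index mapping once and reuse it than to factorize each array separately.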

118

answered Nov 15 '22 00:11

jezrael

Tags: python, pandas, numpy, classification, data-science