
LabelEncoder: How to keep a dictionary that shows original and converted variable

When using LabelEncoder to encode categorical variables into numerics, how does one keep a dictionary in which the transformation is tracked?

That is, a dictionary in which I can see which values became what:

{'A':1,'B':2,'C':3}
ishido asked Jan 06 '16 08:01



1 Answer

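I created a dictionary from classes_:

from sklearn import preprocessing

le = preprocessing.LabelEncoder()
ids = le.fit_transform(labels)  # labels: the original categorical values
# Each class's index in classes_ is its encoded value
mapping = dict(zip(le.classes_, range(len(le.classes_))))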

To test:

all(mapping[x] == i for x, i in zip(le.inverse_transform(ids), ids))

This should return True.
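Put together, the approach above might look like the following minimal sketch. The sample labels list and the inverse_mapping name are illustrative assumptions, not part of the original answer. Note that the encoded values start at 0, so the result is {'A': 0, 'B': 1, 'C': 2} rather than the 1-based dictionary shown in the question.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data, used only for illustration
labels = ["B", "A", "C", "A", "B"]

le = LabelEncoder()
ids = le.fit_transform(labels)

# classes_ holds the sorted unique labels; a label's index is its code.
# str() and int() convert NumPy scalars to plain Python values.
mapping = {str(label): int(code) for code, label in enumerate(le.classes_)}

# Reverse lookup (code -> original label), built from the same mapping
inverse_mapping = {code: label for label, code in mapping.items()}

print(mapping)
print(inverse_mapping)
```

Building the dictionary from classes_ after fitting keeps it in sync with the encoder, and the reverse dictionary offers the same lookup as inverse_transform for single values.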

This works because fit_transform essentially uses numpy.unique to compute the encoded labels and the classes_ attribute in a single call, so a label's position in classes_ is exactly its encoded value:

def fit_transform(self, y):
    self.classes_, y = np.unique(y, return_inverse=True)
    return y
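To see why the index in classes_ matches the encoded value, here is a small sketch of what numpy.unique returns with return_inverse=True. The sample array is a made-up assumption for illustration.

```python
import numpy as np

# Hypothetical sample data, used only for illustration
labels = np.array(["B", "A", "C", "A", "B"])

# classes: the sorted unique values
# codes: for each input element, its index into classes
classes, codes = np.unique(labels, return_inverse=True)

# Indexing classes with codes reconstructs the original array,
# which is the same relationship inverse_transform relies on.
reconstructed = classes[codes]

print(classes)
print(codes)
```

Because codes are indices into the sorted classes array, zipping classes_ with range(len(classes_)) reproduces the encoding exactly.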
James King answered Sep 25 '22 01:09