
How can I check which value is assigned to which label when using sklearn's LabelEncoder()?

I am transforming categorical data to numeric values for machine learning purposes.

For example, the buying price of a car (the "buying" variable) is categorized as "vhigh", "high", "med" or "low". To transform it into numeric values, I used:

from sklearn import preprocessing

le = preprocessing.LabelEncoder()
buying = le.fit_transform(list(data["buying"]))

Is there a way to check exactly how each of those labels was mapped to a numeric value? The assignment looks random to me (e.g. vhigh = 0, high = 2).

Viol1997 asked Nov 21 '25 20:11



1 Answer

You can create an extra column in your dataframe to map the values:

# Copy the original column into a separate dataframe that holds the label-to-code mapping
mapping_df = data[['buying']].copy()
# Passing .values is faster than converting the column to a list
mapping_df['buying_encoded'] = le.fit_transform(data['buying'].values)

Here's a full working example:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
data = pd.DataFrame({'index':[0,1,2,3,4,5,6],
        'buying':['Luffy','Nami','Luffy','Franky','Sanji','Zoro','Luffy']})
data['buying_encoded'] = le.fit_transform(data['buying'].values)
data = data.drop_duplicates('buying').set_index('index')
print(data)

Output:

       buying  buying_encoded
index                        
0       Luffy               1
1        Nami               2
3      Franky               0
4       Sanji               3
5        Zoro               4
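
As the output suggests, the codes are not random: LabelEncoder sorts the distinct labels and numbers them in that order, which is why Franky gets 0 and Zoro gets 4. The fitted encoder exposes that sorted array as classes_, so the mapping can also be read directly without building an extra dataframe. Below is a minimal sketch of that idea; the toy labels list is an assumption standing in for the real data["buying"] column from the question.

```python
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for list(data["buying"]) from the question (illustrative values).
labels = ["vhigh", "high", "med", "low", "high", "low"]

le = LabelEncoder()
buying = le.fit_transform(labels)

# classes_ holds the distinct labels in sorted order; a label's position
# in that array is the integer code the encoder assigns to it.
mapping = {str(label): int(code) for code, label in enumerate(le.classes_)}
print(mapping)  # {'high': 0, 'low': 1, 'med': 2, 'vhigh': 3}

# inverse_transform converts the codes back into the original labels.
decoded = le.inverse_transform(buying)
```

Note that the sorted order is alphabetical here, so it does not follow the natural low-to-vhigh ordering of the categories.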
Celius Stingher answered Nov 24 '25 09:11



