I am converting strings to categorical values in my dataset using the following piece of code.
data['weekday'] = pd.Categorical.from_array(data.weekday).labels
For eg,
index weekday 0 Sunday 1 Sunday 2 Wednesday 3 Monday 4 Monday 5 Thursday 6 Tuesday
After encoding the weekday, my dataset appears like this:
index weekday 0 3 1 3 2 6 3 1 4 1 5 4 6 5
Is there any way I can know that Sunday has been mapped to 3, Wednesday to 6 and so on?
Label Encoder: Label Encoding in Python can be achieved using Sklearn Library. Sklearn provides a very efficient tool for encoding the levels of categorical features into numeric values. LabelEncoder encode labels with a value between 0 and n_classes-1 where n is the number of distinct labels.
An ordinal encoding involves mapping each unique label to an integer value. This type of encoding is really only appropriate if there is a known relationship between the categories. This relationship does exist for some of the variables in our dataset, and ideally, this should be harnessed when preparing the data.
Label Encoding is a popular encoding technique for handling categorical variables. In this technique, each label is assigned a unique integer based on alphabetical ordering.
You can create additional dictionary with mapping:
from sklearn import preprocessing le = preprocessing.LabelEncoder() le.fit(data['name']) le_name_mapping = dict(zip(le.classes_, le.transform(le.classes_))) print(le_name_mapping) {'Tom': 0, 'Nick': 1, 'Kate': 2}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With