I have a pandas dataframe and I'm trying to change the values in a given column which are represented by strings into integers. For instance:
df =
index    fruit     quantity    price
0        apple     5           0.99
1        apple     2           0.99
2        orange    4           0.89
4        banana    1           1.64
...
10023    kiwi      10          0.92
I would like it to look like:
df =
index    fruit     quantity    price
0        1         5           0.99
1        1         2           0.99
2        2         4           0.89
4        3         1           1.64
...
10023    5         10          0.92
I can do this using
df["fruit"] = df["fruit"].map({"apple": 1, "orange": 2,...})
which works if I have a small list to change, but I'm looking at a column with over 500 different labels. Is there any way of changing this from a string to an int?
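You can use LabelEncoder from sklearn.preprocessing. It assigns each distinct label an integer code, starting at 0 and following the sorted order of the labels: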
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
le.fit(df.fruit)
df['categorical_label'] = le.transform(df.fruit)
To transform the labels back to the original strings:
le.inverse_transform(df['categorical_label'])
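If you would rather stay within pandas, pd.factorize is an alternative to the LabelEncoder approach above. The sketch below is not part of the original answer. It assumes a small made-up dataframe modelled on the question's example, and it adds 1 to the codes because factorize numbers labels from 0 in order of first appearance, while the question numbers them from 1.

```python
import pandas as pd

# Toy data modelled on the question's example (values are illustrative).
df = pd.DataFrame({
    "fruit": ["apple", "apple", "orange", "banana", "kiwi"],
    "quantity": [5, 2, 4, 1, 10],
    "price": [0.99, 0.99, 0.89, 1.64, 0.92],
})

# factorize returns integer codes (0-based, in order of first appearance)
# and the unique labels that those codes index into.
codes, uniques = pd.factorize(df["fruit"])

# Shift by one to get the 1-based numbering used in the question.
df["fruit"] = codes + 1

# Recover the original strings by indexing the unique labels with the
# 0-based codes again.
original = uniques[df["fruit"].to_numpy() - 1]
```

Keep uniques around if you need to map the integers back to fruit names later, since the dataframe itself no longer holds the strings.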