I have a data frame (df) with the following:
var1
a 1
a 1
b 2
b 3
c 3
d 5
And a dictionary:
dict_cat = {
'x': ['a', 'b', 'c'],
'y': 'd' }
And I want to create a new column called cat in which, depending on the var1 value, each row takes the corresponding dict key:
var1 cat
a 1 x
a 1 x
b 2 x
b 3 x
c 3 x
d 5 y
I have tried to map the dict to the variable using df['cat'] = df['var1'].map(dict_cat), but since the values are inside a list, Python does not recognize them and I only get NaN values. Is there a way to do this using map, or should I create a function that iterates over the rows, checking whether var1 is in the dictionary lists?
Thanks!
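map looks up each value of var1 as a key of the dict, so the dict's keys must be the var1 values, not the categories. You need to swap the keys and values into a new dict and then use map: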
print (df)
var1 var2
0 a 1
1 a 1
2 b 2
3 b 3
4 c 3
5 d 5
dict_cat = {'x' : ['a', 'b', 'c'],'y' : 'd' }
d = {k: oldk for oldk, oldv in dict_cat.items() for k in oldv}
print (d)
{'a': 'x', 'b': 'x', 'c': 'x', 'd': 'y'}
df['cat'] = df['var1'].map(d)
print (df)
var1 var2 cat
0 a 1 x
1 a 1 x
2 b 2 x
3 b 3 x
4 c 3 x
5 d 5 y
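One caveat about the dict comprehension used here: it iterates over every value, so a bare string such as 'd' only works because it is a single character. A longer string would be split into its characters. Below is a minimal sketch of a more defensive inversion, assuming values are either lists or single strings. The helper name invert_categories and the multi-character value 'dd' are made up for illustration and are not part of the original question.

```python
import pandas as pd


def invert_categories(categories):
    """Turn {category: member or list of members} into {member: category}."""
    inverted = {}
    for category, members in categories.items():
        # Wrap a bare string so it is not iterated character by character.
        if isinstance(members, str):
            members = [members]
        for member in members:
            inverted[member] = category
    return inverted


# 'dd' is a hypothetical multi-character value used to show the difference.
dict_cat = {'x': ['a', 'b', 'c'], 'y': 'dd'}
d = invert_categories(dict_cat)

df = pd.DataFrame({'var1': ['a', 'a', 'b', 'b', 'c', 'dd'],
                   'var2': [1, 1, 2, 3, 3, 5]})
df['cat'] = df['var1'].map(d)
print(df)
```

Any var1 value that does not appear in the inverted dict still ends up as NaN, which is the usual behavior of map with a dict.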
If the first column is actually the index, you can use rename, or convert the index with to_series and then use map:
print (df)
var1
a 1
a 1
b 2
b 3
c 3
d 5
dict_cat = {'x' : ['a', 'b', 'c'],'y' : 'd' }
d = {k: oldk for oldk, oldv in dict_cat.items() for k in oldv}
df['cat'] = df.rename(d).index
Or:
df['cat'] = df.index.to_series().map(d)
print (df)
var1 cat
a 1 x
a 1 x
b 2 x
b 3 x
c 3 x
d 5 y
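As a further option beyond the two shown, the index can probably be mapped directly, since a pandas Index has its own map method that accepts a dict. This is a sketch under the assumption that the labels a, b, c, d are the index of df, as in the printed frame; the frame is rebuilt here only to make the example self-contained.

```python
import pandas as pd

# Rebuild the example frame with the labels as the index (assumed layout).
df = pd.DataFrame({'var1': [1, 1, 2, 3, 3, 5]},
                  index=['a', 'a', 'b', 'b', 'c', 'd'])

# Inverted dict: one entry per index label, pointing at its category.
d = {'a': 'x', 'b': 'x', 'c': 'x', 'd': 'y'}

# Index.map returns a new Index, which is assigned to the column by position.
df['cat'] = df.index.map(d)
print(df)
```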