
Map dictionary values in Pandas

Tags: python, pandas

I have a data frame (df) with the following:

 var1
a 1 
a 1 
b 2  
b 3 
c 3 
d 5 

And a dictionary:

dict_cat = {
    'x': ['a', 'b', 'c'],
    'y': 'd'}

I want to create a new column called cat that holds, for each row, the dictionary key whose values include that row's var1 value:

 var1 cat
a 1 x 
a 1 x
b 2 x
b 3 x
c 3 x
d 5 y

I tried to map the dict onto the variable with df['cat'] = df['var1'].map(dict_cat), but because the values are inside a list, pandas does not match them and I get only NaN. Is there a way to do this with map, or should I write a function that iterates over the rows and checks whether var1 is in one of the dictionary's lists?

Thanks!

topcat asked Dec 10 '22 07:12


1 Answer

You need to swap the keys and values into a new dict, then use map:

print (df)
  var1  var2
0    a     1
1    a     1
2    b     2
3    b     3
4    c     3
5    d     5
dict_cat = {'x' : ['a', 'b', 'c'],'y' : 'd' }

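# Invert the mapping: every element of a value becomes a key pointing to its original key.
# Note: a bare string such as 'd' is iterated character by character, so this
# only works for single-character strings; wrap longer strings in a list.
d = {k: oldk for oldk, oldv in dict_cat.items() for k in oldv}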
print (d)
{'a': 'x', 'b': 'x', 'c': 'x', 'd': 'y'}

df['cat'] = df['var1'].map(d)
print (df)
  var1  var2 cat
0    a     1   x
1    a     1   x
2    b     2   x
3    b     3   x
4    c     3   x
5    d     5   y
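
The comprehension above iterates over each dictionary value, so a multi-character string value would be split into single characters. A minimal sketch of a more defensive inversion follows. The helper name invert_categories is hypothetical, not part of pandas, and the DataFrame is rebuilt here only so the example runs on its own. It also shows why mapping with the original dict gives NaN: map looks up 'a', 'b', ... as keys, and the only keys are 'x' and 'y'.

```python
import pandas as pd


def invert_categories(cat_to_members):
    """Turn {category: member or list of members} into {member: category}.

    Hypothetical helper for illustration; not a pandas function.
    """
    member_to_cat = {}
    for category, members in cat_to_members.items():
        # Wrap a bare string so it counts as one member
        # instead of being iterated character by character.
        if isinstance(members, str):
            members = [members]
        for member in members:
            member_to_cat[member] = category
    return member_to_cat


df = pd.DataFrame({"var1": ["a", "a", "b", "b", "c", "d"],
                   "var2": [1, 1, 2, 3, 3, 5]})
dict_cat = {"x": ["a", "b", "c"], "y": "d"}

# Mapping with the original dict finds none of the values among its keys.
direct = df["var1"].map(dict_cat)

# Mapping with the inverted dict finds every value.
lookup = invert_categories(dict_cat)
df["cat"] = df["var1"].map(lookup)
```

If the same member appears under two categories, the later category silently wins in this sketch; whether that is acceptable depends on the data.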

If the first column is actually the index, you can use rename, or convert the index with to_series and then use map:

print (df)
   var1
a     1
a     1
b     2
b     3
c     3
d     5

dict_cat = {'x' : ['a', 'b', 'c'],'y' : 'd' }
d = {k: oldk for oldk, oldv in dict_cat.items() for k in oldv}

df['cat'] = df.rename(d).index

Or:

df['cat'] = df.index.to_series().map(d)
print (df)
   var1 cat
a     1   x
a     1   x
b     2   x
b     3   x
c     3   x
d     5   y
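
As a possible shortcut for the index case, Index.map can apply the inverted dict directly, without going through to_series. This is a sketch under the same assumptions as above (labels in the index, d already inverted), with the DataFrame rebuilt so the example is self-contained:

```python
import pandas as pd

df = pd.DataFrame({"var1": [1, 1, 2, 3, 3, 5]},
                  index=["a", "a", "b", "b", "c", "d"])
d = {"a": "x", "b": "x", "c": "x", "d": "y"}

# Index.map returns a new Index, which is assigned to the column by position,
# so the duplicate labels in the index cause no alignment problems.
df["cat"] = df.index.map(d)
```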
jezrael answered Dec 24 '22 10:12