 

map multiple columns by a single dictionary in pandas

I have a DataFrame with multiple columns containing 'yes' and 'no' strings. I want to convert all of them to a boolean dtype. To map one column, I would use

dict_map_yn_bool={'yes':True, 'no':False}
df['nearby_subway_station'].map(dict_map_yn_bool)

This does the job for one column. How can I convert multiple columns with a single line of code?

user2958481 asked May 01 '17 20:05



2 Answers

You can use applymap:

import pandas as pd

df = pd.DataFrame({'nearby_subway_station':['yes','no'], 'Station':['no','yes']})
print (df)
  Station nearby_subway_station
0      no                   yes
1     yes                    no

dict_map_yn_bool={'yes':True, 'no':False}

df = df.applymap(dict_map_yn_bool.get)
print (df)
  Station nearby_subway_station
0   False                  True
1    True                 False
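A note for newer installations: pandas 2.1 deprecated DataFrame.applymap in favour of DataFrame.map, which takes the same element-wise function. The following is only a minimal sketch of a version-tolerant call, reusing the setup above; the name converted is introduced here purely for illustration.

```python
import pandas as pd

dict_map_yn_bool = {"yes": True, "no": False}
df = pd.DataFrame({"nearby_subway_station": ["yes", "no"], "Station": ["no", "yes"]})

# DataFrame.map is the newer name for the element-wise operation;
# fall back to applymap on pandas versions that do not have it.
if hasattr(pd.DataFrame, "map"):
    converted = df.map(dict_map_yn_bool.get)
else:
    converted = df.applymap(dict_map_yn_bool.get)

print(converted)
```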

Another solution:

for x in df:
    df[x] = df[x].map(dict_map_yn_bool)
print (df)
  Station nearby_subway_station
0   False                  True
1    True                 False
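If the DataFrame also holds columns that should not be converted, the same per-column map can be limited to a chosen list. This is a sketch under assumptions: the extra line_count column and the yes_no_columns list are hypothetical and only there to show that other columns are left alone.

```python
import pandas as pd

dict_map_yn_bool = {"yes": True, "no": False}
df = pd.DataFrame({
    "nearby_subway_station": ["yes", "no"],
    "Station": ["no", "yes"],
    "line_count": [3, 0],  # hypothetical non yes/no column
})

# Hypothetical list of the columns that hold 'yes'/'no' strings.
yes_no_columns = ["nearby_subway_station", "Station"]

# assign returns a new DataFrame with only the listed columns replaced.
converted = df.assign(
    **{col: df[col].map(dict_map_yn_bool) for col in yes_no_columns}
)

print(converted.dtypes)
```

Because every value in the listed columns is a key of the dict, the mapped columns come out with a real bool dtype rather than object.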

Thanks to Jon Clements for a very nice idea, using replace:

df = df.replace({'yes': True, 'no': False})
print (df)
  Station nearby_subway_station
0   False                  True
1    True                 False

The three approaches differ when the data contain values that are not in the dict:

df = pd.DataFrame({'nearby_subway_station':['yes','no','a'], 'Station':['no','yes','no']})
print (df)
  Station nearby_subway_station
0      no                   yes
1     yes                    no
2      no                     a

applymap with dict.get creates None for boolean and string data, and NaN for numeric data:

df = df.applymap(dict_map_yn_bool.get)
print (df)
  Station nearby_subway_station
0   False                  True
1    True                 False
2   False                  None

map creates NaN:

for x in df:
    df[x] = df[x].map(dict_map_yn_bool)

print (df)
  Station nearby_subway_station
0   False                  True
1    True                 False
2   False                   NaN

replace doesn't create NaN or None; values that are not in the dict are left untouched:

df = df.replace(dict_map_yn_bool)
print (df)
  Station nearby_subway_station
0   False                  True
1    True                 False
2   False                     a
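Since all three approaches let unexpected values through in one form or another (None, NaN, or the original string), it can help to check for them before converting. The helper below, map_strict, is a hypothetical name and not part of pandas; it is a sketch that raises when any cell is not a key of the dict.

```python
import pandas as pd

dict_map_yn_bool = {"yes": True, "no": False}


def map_strict(df, mapping):
    """Map every cell through `mapping`; raise if any cell is not a key."""
    unknown = ~df.isin(list(mapping))
    if unknown.to_numpy().any():
        bad = sorted(set(df.to_numpy()[unknown.to_numpy()]))
        raise ValueError(f"values not in mapping: {bad}")
    return df.apply(lambda col: col.map(mapping))


clean = pd.DataFrame({"nearby_subway_station": ["yes", "no"],
                      "Station": ["no", "yes"]})
dirty = pd.DataFrame({"nearby_subway_station": ["yes", "no", "a"],
                      "Station": ["no", "yes", "no"]})

converted = map_strict(clean, dict_map_yn_bool)

try:
    map_strict(dirty, dict_map_yn_bool)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(converted)
print(error_message)
```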
jezrael answered Oct 25 '22 09:10



You could use a stack/unstack idiom:

df.stack().map(dict_map_yn_bool).unstack()

Using @jezrael's setup

df = pd.DataFrame({'nearby_subway_station':['yes','no'], 'Station':['no','yes']})
dict_map_yn_bool={'yes':True, 'no':False}

Then

df.stack().map(dict_map_yn_bool).unstack()

  Station nearby_subway_station
0   False                  True
1    True                 False
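To see why this works, the chain can be split into its three steps. This is only an illustrative sketch on the same setup; the intermediate names stacked, mapped and result are introduced here and are not part of the answer.

```python
import pandas as pd

dict_map_yn_bool = {"yes": True, "no": False}
df = pd.DataFrame({"nearby_subway_station": ["yes", "no"],
                   "Station": ["no", "yes"]})

# stack turns the frame into one long Series indexed by (row label, column name).
stacked = df.stack()

# Series.map can then apply the dict to every value in a single call.
mapped = stacked.map(dict_map_yn_bool)

# unstack moves the column names back out of the index, restoring the 2-D shape.
result = mapped.unstack()

print(stacked)
print(result)
```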

Timing (small data and bigger data): [images not preserved]

piRSquared answered Oct 25 '22 09:10
