Replacing multiple values per cell in Pandas

I have the following column in a dataframe:

Q2
1 4
1 3
3 4 11 
1 4 6 15 16

I want to replace multiple values in a cell, if present: 1 gets replaced by Facebook, 2 by Instagram, and so on.

I split the values as follows:

columns_to_split = ['Q2']

for c in columns_to_split:
    df[c] = df[c].str.split(' ')

which outputs

code                             
DSOKF31                          [1, 4]
DSOVH39                          [1, 3]
DSOVH05                          [3, 4, 16]
DSOVH23                          [1, 4, 6, 15, 16]
Name: Q2, dtype: object

but when trying to replace the multiple values with a dictionary as follows:

social_media_2 = {'1':'Facebook', '2':'Instagram', '3':'Twitter', '4':'Messenger (Google hangout, Tagg, WhatsAPP, MSG, Facetime, IMO)', '5':'SnapChat', '6':'Imo', '7':'Badoo', '8':'Viber', '9':'Twoo', '10':'Linkedin', '11':'Flickr', '12':'Meetup', '13':'Tumblr', '14':'Pinterest', '15':'Yahoo', '16':'Gmail', '17':'Hotmail', '18':'M-Pesa', '19':'M-Shwari', '20':'KCB-Mpesa', '21':'Equitel', '22':'MobiKash', '23':'Airtel money', '24':'Orange Money', '25':'Mobile Bankig Accounts', '26':'Other specify'}

df['Q2'] = df['Q2'].replace(social_media_2)

I get the same output:

code                             
DSOKF31                          [1, 4]
DSOVH39                          [1, 3]
DSOVH05                          [3, 4, 16]
DSOVH23                          [1, 4, 6, 15, 16]
Name: Q2, dtype: object

How do I replace multiple values in one cell in this case?

John Boss asked Oct 24 '25 18:10


2 Answers

If you don't need lists as output, just add regex=True to replace:

import pandas as pd

df = pd.DataFrame({'Q2': ['1 4', '1 3', '3 4 11']})
print (df)
       Q2
0     1 4
1     1 3
2  3 4 11

social_media_2 = {'1':'Facebook', '2':'Instagram', '3':'Twitter', '4':'Messenger (Google hangout, Tagg, WhatsAPP, MSG, Facetime, IMO)', '5':'SnapChat', '6':'Imo', '7':'Badoo', '8':'Viber', '9':'Twoo', '10':'Linkedin', '11':'Flickr', '12':'Meetup', '13':'Tumblr', '14':'Pinterest', '15':'Yahoo', '16':'Gmail', '17':'Hotmail', '18':'M-Pesa', '19':'M-Shwari', '20':'KCB-Mpesa', '21':'Equitel', '22':'MobiKash', '23':'Airtel money', '24':'Orange Money', '25':'Mobile Bankig Accounts', '26':'Other specify'}
df['Q2'] = df['Q2'].replace(social_media_2, regex=True)
print (df)

                                                  Q2
0  Facebook Messenger (Google hangout, Tagg, What...
1                                   Facebook Twitter
2  Twitter Messenger (Google hangout, Tagg, Whats...

If you need lists, use another solution.
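One caveat with the dictionary replace above: with regex=True each key is matched as a substring, so a code such as 11 can be caught by the key 1 first and come out as FacebookFacebook instead of Flickr (the truncated printout hides this). A minimal sketch of one way around it, assuming the codes are always runs of digits: match each whole number once with Series.str.replace and look it up through a callable. The shortened dictionary and the Q2_named column name are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Q2': ['1 4', '1 3', '3 4 11']})

# Shortened subset of the mapping, with an abbreviated label for code 4 (illustration only).
social_media_2 = {'1': 'Facebook', '3': 'Twitter', '4': 'Messenger', '11': 'Flickr'}

# r'\d+' matches each complete number once, so '11' is looked up as '11'
# and never as two separate '1' codes. Unknown codes are left unchanged.
df['Q2_named'] = df['Q2'].str.replace(
    r'\d+',
    lambda m: social_media_2.get(m.group(0), m.group(0)),
    regex=True,
)
print(df['Q2_named'].tolist())
```

Because every number is replaced in a single pass, the order of the dictionary keys does not matter here.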

EDIT by comment:

You can replace the whitespace with ; first, and the same replace then works:

df = pd.DataFrame({'Q2': ['1 4', '1 3', '3 4 11']})
print (df)
       Q2
0     1 4
1     1 3
2  3 4 11

df['Q2'] = df['Q2'].str.replace(' ',';')
print (df)
       Q2
0     1;4
1     1;3
2  3;4;11

social_media_2 = {'1':'Facebook', '2':'Instagram', '3':'Twitter', '4':'Messenger (Google hangout, Tagg, WhatsAPP, MSG, Facetime, IMO)', '5':'SnapChat', '6':'Imo', '7':'Badoo', '8':'Viber', '9':'Twoo', '10':'Linkedin', '11':'Flickr', '12':'Meetup', '13':'Tumblr', '14':'Pinterest', '15':'Yahoo', '16':'Gmail', '17':'Hotmail', '18':'M-Pesa', '19':'M-Shwari', '20':'KCB-Mpesa', '21':'Equitel', '22':'MobiKash', '23':'Airtel money', '24':'Orange Money', '25':'Mobile Bankig Accounts', '26':'Other specify'}
df['Q2'] = df['Q2'].replace(social_media_2, regex=True)
print (df)
                                                  Q2
0  Facebook;Messenger (Google hangout, Tagg, What...
1                                   Facebook;Twitter
2  Twitter;Messenger (Google hangout, Tagg, Whats...

EDIT1:

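You can also change the dict slightly by adding ; to the keys and separating the values with a double ;. Note that in the output below 11 still comes out as 1Fa rather than KL, because the key 1; also matches the end of 11;: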

df = pd.DataFrame({'Q2': ['1 2', '1 3', '3 2 11']})
print (df)
       Q2
0     1 2
1     1 3
2  3 2 11

df['Q2'] = df['Q2'].str.replace(' ',';;') + ';'
print (df)
          Q2
0      1;;2;
1      1;;3;
2  3;;2;;11;

social_media_2 = {'1':'Fa', '2':'I', '3':'T', '11':'KL'}
#add ; to keys in dict
social_media_2 = dict((key + ';', value) for (key, value) in social_media_2.items())
print (social_media_2)
{'1;': 'Fa', '2;': 'I', '3;': 'T', '11;': 'KL'}
df['Q2'] = df['Q2'].replace(social_media_2, regex=True)
print (df)
        Q2
0     Fa;I
1     Fa;T
2  T;I;1Fa
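The 1Fa in the last row comes from the key 1; matching inside 11;. A possible fix, sketched here under the assumption that the replacement labels never contain a standalone code themselves, is to wrap every key in \b word boundaries instead of appending ; to it. The name bounded is hypothetical.

```python
import re
import pandas as pd

df = pd.DataFrame({'Q2': ['1 2', '1 3', '3 2 11']})
social_media_2 = {'1': 'Fa', '2': 'I', '3': 'T', '11': 'KL'}

# Wrap each key in word boundaries so that '1' cannot match inside '11'.
# 'bounded' is a hypothetical helper name, not part of the original answer.
bounded = {r'\b' + re.escape(key) + r'\b': value
           for key, value in social_media_2.items()}

df['Q2'] = (
    df['Q2']
    .str.replace(' ', ';', regex=False)  # single ; as separator
    .replace(bounded, regex=True)        # whole-code matches only
)
print(df['Q2'].tolist())
```

With the boundaries in place a single ; separator is enough, and the trailing ; is no longer needed.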
jezrael answered Oct 26 '25 09:10


Since the number of items per cell varies, there isn't much structure to exploit. Still, after you split the string, you can apply a function that maps each list element to its dictionary value:

In [36]: df = pd.DataFrame({'Q2': ['1 4', '1 3', '1 2 3']})

In [37]: df.Q2.str.split(' ').apply(lambda l: [social_media_2[e] for e in l])
Out[37]: 
0    [Facebook, Messenger (Google hangout, Tagg, Wh...
1                                  [Facebook, Twitter]
2                       [Facebook, Instagram, Twitter]
Name: Q2, dtype: object

Edit: Following jezrael's excellent comment, here's a version that accounts for missing values as well:

In [58]: df = pd.DataFrame({'Q2': ['1 4', '1 3', '1 2 3', None]})

In [59]: df.Q2.str.split(' ').apply(lambda l: [social_media_2[e] for e in l] if isinstance(l, list) else [])
Out[59]: 
0    [Facebook, Messenger (Google hangout, Tagg, Wh...
1                                  [Facebook, Twitter]
2                       [Facebook, Instagram, Twitter]
3                                                   []
Name: Q2, dtype: object
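Two more edge cases can trip up the list comprehension: a trailing space (as in the '3 4 11 ' row of the question) leaves an empty string after split(' '), and a code missing from the dictionary raises KeyError. A hedged sketch that tolerates both, assuming unknown codes should simply be kept as they are; to_names is a hypothetical helper and the dictionary is a shortened subset.

```python
import pandas as pd

# Shortened subset of the mapping, for illustration only.
social_media_2 = {'1': 'Facebook', '2': 'Instagram', '3': 'Twitter', '11': 'Flickr'}

df = pd.DataFrame({'Q2': ['1 2', '3 11 ', '1 99', None]})

def to_names(cell):
    """Map a list of codes to labels; missing cells become empty lists."""
    if not isinstance(cell, list):
        return []
    # dict.get falls back to the raw code when it is not in the mapping.
    return [social_media_2.get(code, code) for code in cell]

# str.split() without a separator splits on any whitespace run
# and drops the empty strings left by leading or trailing spaces.
result = df['Q2'].str.split().apply(to_names)
print(result.tolist())
```

Whether to keep, drop, or flag unknown codes depends on the data; keeping them makes bad inputs easy to spot afterwards.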
Ami Tavory answered Oct 26 '25 08:10


