Pandas dataframe apply function to column strings based on other column value

Tags:

python

pandas

I would like to remove all instances of the string in col 'B' from col 'A', like so:

col A                 col B    col C
1999 toyota camry     camry    1999 toyota
2003 nissan pulsar    pulsar   2003 nissan

How would I do this using pandas? If it were a fixed value (not dependent on another column), I would use:

df['col C'] = df['col A'].str.replace('value-to-replace','')
Testy8 asked Apr 03 '16 09:04


2 Answers

Given a DataFrame of:

import pandas as pd

df = pd.DataFrame(
    {
        'A': ['1999 toyota camry', '2003 nissan pulsar'],
        'B': ['camry', 'pulsar']
    }
)

You can use df.apply with axis=1 to call a function on each row and perform the replacement there:

df['C'] = df.apply(lambda row: row.A.replace(row.B, ''), axis=1)

This'll give you:

                    A       B             C
0   1999 toyota camry   camry  1999 toyota 
1  2003 nissan pulsar  pulsar  2003 nissan 
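
Note that each result keeps the space that preceded the removed word. If that is unwanted, one possible variation (a sketch, not part of the original answer) chains `.strip()` onto the replacement. It assumes both columns contain only strings:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'A': ['1999 toyota camry', '2003 nissan pulsar'],
        'B': ['camry', 'pulsar']
    }
)

# Remove every occurrence of B from A row by row,
# then strip the whitespace left at either end.
df['C'] = df.apply(lambda row: row['A'].replace(row['B'], '').strip(), axis=1)

print(df['C'].tolist())  # ['1999 toyota', '2003 nissan']
```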
Jon Clements answered Oct 16 '22 10:10


Suppose you have a dataframe:

df

                col A   col B
0   1999 toyota camry   camry
1  2003 nissan pulsar  pulsar

Then you may proceed as follows:

df['col C'] = [a.replace(b, '') for a, b in zip(df['col A'], df['col B'])]
df

                col A   col B        col C
0   1999 toyota camry   camry  1999 toyota
1  2003 nissan pulsar  pulsar  2003 nissan
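
If the word to remove can appear in the middle of the string or more than once, the plain replacement leaves doubled or trailing spaces behind. A minimal sketch of one way to tidy that up follows. The helper name `remove_substring` is hypothetical, the second row is altered from the question's data purely to illustrate the case, and both columns are assumed to hold strings:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'col A': ['1999 toyota camry', '2003 pulsar nissan pulsar'],
        'col B': ['camry', 'pulsar'],
    }
)

def remove_substring(text, target):
    """Hypothetical helper: drop every occurrence of target from text,
    then collapse any runs of whitespace left behind."""
    return ' '.join(text.replace(target, '').split())

# Pair the two columns element-wise and apply the helper to each pair.
df['col C'] = [remove_substring(a, b) for a, b in zip(df['col A'], df['col B'])]

print(df['col C'].tolist())  # ['1999 toyota', '2003 nissan']
```

Iterating over `zip` of the two columns avoids building a row object for each record, as `df.apply(..., axis=1)` does.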
Sergey Bushmanov answered Oct 16 '22 08:10