This is my first question on StackOverflow, so let me know if I should formulate anything differently...
I want to replace some values in one column of a pandas DataFrame, depending on a condition on the values in another column, but leave the original values where the condition is False. For example:
import pandas as pd
df = pd.DataFrame({'col1': ['A','B','C','B'], 'col2': ['z','x','x','x']},
                  columns=['col1','col2'])
df =
  col1 col2
0    A    z
1    B    x
2    C    x
3    B    x
Say I want to replace the values in col2 with 'q' if the value in col1 is 'B' or 'C', but keep the original values ('z', 'x') if the value in col1 is neither 'B' nor 'C'. In reality I have a much larger DataFrame with hundreds of unique values in col1, and I want to replace the values in col2 for about 20 of them.
My current solution is to create a dictionary, using col1 as keys and col2 as values, and then:
dict1.update({'B':'q'})
df['col2'] = df['col1'].map(dict1)
But this trick only works if values in the two columns correlate exactly (or if values in col1 are unique).
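To make that limitation concrete, here is a minimal sketch. The data is hypothetical (it differs from the example above in that 'B' is paired with two different col2 values), and building dict1 with dict(zip(...)) is an assumption about how the dictionary is created:

```python
import pandas as pd

# Hypothetical data: col1 'B' appears with two different col2 values.
df = pd.DataFrame({'col1': ['A', 'B', 'C', 'B'],
                   'col2': ['z', 'x', 'x', 'y']})

# Build the col1 -> col2 lookup; the later 'B' row overwrites the earlier one.
dict1 = dict(zip(df['col1'], df['col2']))
dict1.update({'C': 'q'})

# Mapping every row through the dictionary also rewrites rows
# that were never meant to change.
result = df['col1'].map(dict1)
print(result.tolist())  # ['z', 'y', 'q', 'y']
```

Row 1 started out as 'x' but comes back as 'y', even though only 'C' was supposed to be replaced.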
So I was wondering if there is a more elegant solution: replace the value in col2 only if col1 matches a certain condition, and otherwise leave the original value.
Mask the df first using loc and isin, then call map as before:
In [376]:
dict1 = {'B':'q'}
df.loc[df['col1'].isin(dict1.keys()), 'col2'] = df['col1'].map(dict1)
df
Out[376]:
  col1 col2
0    A    z
1    B    q
2    C    x
3    B    q
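If every matching row should get the same replacement value, as in the question where both 'B' and 'C' become 'q', the map step may not be needed at all. A minimal sketch, assuming the values to match are kept in a plain list (the names targets and mask are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B', 'C', 'B'],
                   'col2': ['z', 'x', 'x', 'x']})

# Illustrative list of col1 values whose col2 entry should be replaced.
targets = ['B', 'C']

# Boolean mask: True where col1 is one of the targets.
mask = df['col1'].isin(targets)

# Assign only on the masked rows; all other rows keep their original col2.
df.loc[mask, 'col2'] = 'q'

print(df)
```

With about 20 values to match, only the targets list grows; the mask and the assignment stay the same. When different keys need different replacement values, the dictionary plus map approach shown above is still the better fit.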