I am trying to insert a value into a dataframe based on a comparison with another dataframe. Here is an example:
>>> import pandas as pd
>>> import numpy as np
>>> df
                       name
0  richard Finn, Tim Maltby
1          Fernando Lebrija
>>> df2
           Fullname   id
0      richard Finn  500
1        Tim Maltby  699
2  Fernando Lebrija  300
The desired output is:
>>> df
                       name       id
0  richard Finn, Tim Maltby  500,699
1          Fernando Lebrija      300
I tried using:
df['id'] = np.where((df['name']==df2['Fullname']), df2['id]', df['id'])
but it gives me the following error: `SyntaxError: invalid syntax`
You can do a split, explode, then map and groupby:
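df['id'] = (df['name'].str.split(r',\s*')            # split each cell on commas (raw string for the regex)
              .explode()                             # one name per row, original index kept
              .map(df2.set_index('Fullname')['id'])  # look up the id for each name
              .groupby(level=0).agg(list)            # collect the ids back per original row
           )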
Output:
                       name          id
0  richard Finn, Tim Maltby  [500, 699]
1          Fernando Lebrija       [300]
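If you want comma-separated strings like `500,699`, as in the desired output of the question, rather than lists, one possible variation is to cast the mapped ids to strings and join them per row. This is only a sketch: the two frames are rebuilt from the sample data above, `lookup` is a helper name introduced here for illustration, and it assumes every name in `df` has an exact match in `df2['Fullname']` (an unmatched name would map to NaN and show up as the text `nan`).

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({"name": ["richard Finn, Tim Maltby", "Fernando Lebrija"]})
df2 = pd.DataFrame({
    "Fullname": ["richard Finn", "Tim Maltby", "Fernando Lebrija"],
    "id": [500, 699, 300],
})

# Series indexed by full name, used as a lookup table (illustrative helper name)
lookup = df2.set_index("Fullname")["id"]

df["id"] = (
    df["name"].str.split(r",\s*")   # split each cell on commas
    .explode()                      # one name per row, original index repeated
    .map(lookup)                    # replace each name with its id
    .astype(str)                    # ids as text so they can be joined
    .groupby(level=0)               # regroup by the original row index
    .agg(",".join)                  # e.g. "500,699"
)

print(df)
```

The result is assigned back by index, so each row of `df` gets the joined ids for the names it contained.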