
Inserting a value into a dataframe based on a comparison condition

I am trying to insert a value into a dataframe based on a comparison with another dataframe. Here is an example:

>>> import pandas as pd
>>> import numpy as np
>>> df
                       name
0  richard Finn, Tim Maltby
1          Fernando Lebrija

>>> df2
           Fullname   id
0      richard Finn  500
1        Tim Maltby  699
2  Fernando Lebrija  300

The desired output is:

>>> df
                       name       id
0  richard Finn, Tim Maltby  500,699
1          Fernando Lebrija      300

I tried using:

df['id'] = np.where((df['name']==df2['Fullname']), df2['id]', df['id'])

but it gives me the following error: `SyntaxError: invalid syntax`

Sarah john asked Mar 03 '23 08:03



1 Answer

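The SyntaxError comes from the misplaced quote in df2['id]', which should be df2['id']. Fixing it is not enough, though: df and df2 have different lengths, and a single name cell can hold several names, so a row-wise == comparison cannot match them. Instead, split each name cell on commas, explode to one name per row, map each name to its id through df2, then group by the original row index: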

# Raw string so that \s reaches the regex engine as written
df['id'] = (df['name'].str.split(r',\s*')
    .explode()
    .map(df2.set_index('Fullname')['id'])
    .groupby(level=0).agg(list)
)

Output:

                       name          id
0  richard Finn, Tim Maltby  [500, 699]
1          Fernando Lebrija       [300]
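
The output above holds lists, while the question asks for comma-joined strings such as 500,699. A minimal, self-contained sketch of one way to get that form is below. It rebuilds the two example frames from the question and assumes every name in df has a match in df2. The variable name lookup is introduced here only for illustration.

```python
import pandas as pd

# Example frames rebuilt from the question
df = pd.DataFrame({"name": ["richard Finn, Tim Maltby", "Fernando Lebrija"]})
df2 = pd.DataFrame({
    "Fullname": ["richard Finn", "Tim Maltby", "Fernando Lebrija"],
    "id": [500, 699, 300],
})

# Series indexed by full name, used as a name -> id lookup table
lookup = df2.set_index("Fullname")["id"]

df["id"] = (
    df["name"]
    .str.split(r",\s*")   # one list of names per row
    .explode()            # one name per row, original index repeated
    .map(lookup)          # name -> id (NaN if a name is missing from df2)
    .astype(str)          # ids as text so they can be joined
    .groupby(level=0)     # regroup by the original row index
    .agg(",".join)        # "500,699" instead of [500, 699]
)

print(df)
```

If some names might be missing from df2, the mapped values become floats with NaN, so the astype(str) step would need extra handling (for example dropping the NaN rows first).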
Quang Hoang answered Mar 04 '23 22:03
