Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

About Pandas Dataframe

I have a question related to Pandas.

In df1 I have a data frame with the id of each seller and their respective names.

In df2 I have the id of the salesmen and their respective sales.

I would like to have in the df2, two new columns with the first name and last names of the salesmen.

PS. in df2 one of the sales is shared between two vendors.

import pandas as pd

vendors = {'first_name': ['Montgomery', 'Dagmar', 'Reeba', 'Shalom', 'Broddy', 'Aurelia'],
         'last_name': ['Humes', 'Elstow', 'Wattisham', 'Alen', 'Keningham', 'Brechin'],
         'id_vendor': [127, 241, 329, 333, 212, 233]}

sales = {'id_vendor': [['127'], ['241'], ['329, 333'], ['212'], ['233']],
         'sales': [1233, 25000, 8555, 4333, 3222]}

df1 = pd.DataFrame(vendors)
df2 = pd.DataFrame(sales)

I attach the code. any suggestions?`

Thank you in advance.

like image 646
Junior P Avatar asked Apr 29 '26 23:04

Junior P


1 Answers

You can merge df1 with df2 with the exploded id_vendors column and use DataFrame.GroupBy.agg when grouping by sales to obtain the columns as you want:

transform_names = lambda x: ', '.join(list(x))

res = (df1.merge(df2.explode('id_vendor')).
       groupby('sales').
       agg({'first_name': transform_names, 'last_name': transform_names, 
            'id_vendor': list})
      )

print(res)
          first_name        last_name   id_vendor
sales                                            
1233      Montgomery            Humes       [127]
3222         Aurelia          Brechin       [233]
4333          Broddy        Keningham       [212]
8555   Reeba, Shalom  Wattisham, Alen  [329, 333]
25000         Dagmar           Elstow       [241]

Note:

In your example, id_vendors in df2 is populated by lists of strings, but since id_vendor in df1 is of integer type, I assume that it was a typo. If id_vendors is indeed containing lists of strings, you need to also convert the strings to integers:

transform_names = lambda x: ', '.join(list(x))

# Notice the .astype(int) call.
res = (df1.merge(df2.explode('id_vendor').astype(int)).
       groupby('sales').
       agg({'first_name': transform_names, 'last_name': transform_names, 
            'id_vendor': list})
      )

print(res)
like image 94
user2246849 Avatar answered May 02 '26 12:05

user2246849



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!