Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas dataframe: groupby one column, but concatenate and aggregate by others [duplicate]

Tags:

python

pandas

How do I turn the below input data (Pandas dataframe fed from Excel file):

ID      Category                    Speaker     Price
334014  Real Estate Perspectives    Tom Smith   100
334014  E&E                         Tom Smith   200
334014  Real Estate Perspectives    Janet Brown 100
334014  E&E                         Janet Brown 200

into this:

ID      Category                    Speaker                 Price
334014  Real Estate Perspectives    Tom Smith, Janet Brown  100
334014  E&E                         Tom Smith, Janet Brown  200   

So basiscally I want to group by Category, concatenate the Speakers, but not aggregate Price.

I tried different approaches with Pandas dataframe.groupby() and .agg(), but to no avail. Maybe there is simpler pure Python solution?

like image 897
barciewicz Avatar asked Oct 15 '25 07:10

barciewicz


2 Answers

There are 2 possible solutions - aggregate by multiple columns and join:

dataframe.groupby(['ID','Category','Price'])['Speaker'].apply(','.join)

Or need aggregate only Price column, then is necessary aggregate all columns by first or last:

dataframe.groupby('Price').agg({'Speaker':','.join, 'ID':'first', 'Price':'first'})
like image 81
jezrael Avatar answered Oct 17 '25 21:10

jezrael


Try this

df.groupby(['ID','Category'],as_index=False).agg(lambda x : x if x.dtype=='int64' else ', '.join(x))
like image 29
Pankaj Sharma Avatar answered Oct 17 '25 21:10

Pankaj Sharma