
Pandas combine column values for similar rows

Tags:

python

pandas

I have a dataframe in which some rows share the same combination of values in the first three columns but differ in the title column. For every group of rows with the same combination, I need to concatenate the distinct title values into the title column of each row.

Sample Data

| program | subject | course | title |
|:------- |:------- |:------ |:----- |
|music    | eng     | 101    | 000   |
|music    | math    | 101    | 123   |
|music    | eng     | 102    | 000   |
|music    | math    | 101    | 456   |
|art      | span    | 201    | 123   |
|art      | hst     | 101    | 000   |
|art      | span    | 201    | 456   |
|art      | span    | 202    | 000   |
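For anyone who wants to reproduce this, the sample data can be built roughly as follows. This is only a sketch: it assumes `course` is an integer column and that `title` is stored as strings, so that values like `000` keep their leading zeros.

```python
import pandas as pd

# Sample data from the table above.
# Assumption: 'title' is kept as strings so '000' does not collapse to 0.
df = pd.DataFrame({
    'program': ['music', 'music', 'music', 'music', 'art', 'art', 'art', 'art'],
    'subject': ['eng', 'math', 'eng', 'math', 'span', 'hst', 'span', 'span'],
    'course': [101, 101, 102, 101, 201, 101, 201, 202],
    'title': ['000', '123', '000', '456', '123', '000', '456', '000'],
})
print(df)
```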

Desired Data

| program | subject | course | title    |
|:------- |:------- |:------ |:-------- |
|music    | eng     | 101    | 000      |
|music    | math    | 101    | 123-456  |
|music    | eng     | 102    | 000      |
|music    | math    | 101    | 456-123  |
|art      | span    | 201    | 123-456  |
|art      | hst     | 101    | 000      |
|art      | span    | 201    | 456-123  |
|art      | span    | 202    | 000      |

The first three columns match in the 2nd and 4th rows, and again in the 5th and 7th rows. I want to concatenate the titles so that each row contains the combined titles of all its matching rows.

JElwood asked Sep 08 '25 07:09



1 Answer

Let's try groupby with transform, which joins the titles within each group and returns a result aligned to the original rows:

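# Join the titles within each (program, subject, course) group; transform
# broadcasts the joined string back to every row of that group.
# Assumes 'title' holds strings; otherwise apply .astype(str) first.
df['title'] = df.groupby(
    ['program', 'subject', 'course'], as_index=False, sort=False
)['title'].transform('-'.join)
print(df)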

Output:

  program subject  course    title
0   music     eng     101      000
1   music    math     101  123-456
2   music     eng     102      000
3   music    math     101  123-456
4     art    span     201  123-456
5     art     hst     101      000
6     art    span     201  123-456
7     art    span     202      000
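
Note that the output above gives every row in a group the same joined string, whereas the Desired Data table in the question puts each row's own title first (`123-456` for row 1, `456-123` for row 3). If that ordering matters, one possible variant is sketched below. The helper name `own_title_first` is made up for illustration, the sample frame is rebuilt so the snippet runs on its own, and titles are again assumed to be strings.

```python
import pandas as pd

# Sample data rebuilt here so the snippet is self-contained (titles as strings).
df = pd.DataFrame({
    'program': ['music', 'music', 'music', 'music', 'art', 'art', 'art', 'art'],
    'subject': ['eng', 'math', 'eng', 'math', 'span', 'hst', 'span', 'span'],
    'course': [101, 101, 102, 101, 201, 101, 201, 202],
    'title': ['000', '123', '000', '456', '123', '000', '456', '000'],
})

keys = ['program', 'subject', 'course']

def own_title_first(titles):
    # Hypothetical helper: for each row of the group, put that row's own
    # title first, followed by the other titles in order of appearance.
    values = titles.tolist()
    joined = [
        '-'.join([value] + values[:i] + values[i + 1:])
        for i, value in enumerate(values)
    ]
    # Return a Series on the group's index so transform can align it.
    return pd.Series(joined, index=titles.index)

df['title'] = df.groupby(keys, sort=False)['title'].transform(own_title_first)
print(df)
```

On the sample data this gives `123-456` for rows 1 and 4 and `456-123` for rows 3 and 6, while single-row groups keep `000`. It calls a Python function once per group, so it will likely be slower than the plain `'-'.join` version on large frames, and it does not drop duplicate titles within a group.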
Henry Ecker answered Sep 10 '25 10:09
