Grouping one column using agg & join but only on unique values

Question

I have this cunning piece of code that I'm using on the following dataset:

import pandas as pd

df = pd.DataFrame({
    'contact_email': ['info@info.com', 'info@info.com', 'info@info.com'],
    'interest': ['Math', 'Science', 'Science']
})
print(df)

   contact_email interest
0  info@info.com     Math
1  info@info.com  Science
2  info@info.com  Science

df = df.groupby('contact_email').agg({'interest' : ' '.join}).reset_index()
print(df)

   contact_email              interest
0  info@info.com  Math Science Science

This is so close to what I wanted, but I need to return only the unique interests. (I have users/customers submitting the same form, with the same values, almost 10 times!)

Also, as a nice-to-have, does anyone know how to remove the 0, 1, 2, 3 index?

Thanks!

jezrael · Accepted Answer

Use unique to remove duplicates:

df = (df.groupby('contact_email')
        .agg({'interest' : lambda x: ' '.join(x.unique())})
        .reset_index())
print(df)
   contact_email      interest
0  info@info.com  Math Science
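As a small illustration of how this behaves with more than one contact, here is a minimal sketch. The sample data is hypothetical (it is not from the question). Series.unique returns values in order of first appearance, so each group's interests stay in the order they were first entered:

```python
import pandas as pd

# Hypothetical sample data with two contacts (not from the original question).
sample = pd.DataFrame({
    'contact_email': ['a@x.com', 'b@x.com', 'a@x.com', 'a@x.com', 'b@x.com'],
    'interest': ['Science', 'Art', 'Math', 'Science', 'Art'],
})

# unique() drops repeats within each group and keeps first-appearance order.
unique_joined = (sample.groupby('contact_email')
                       .agg({'interest': lambda x: ' '.join(x.unique())})
                       .reset_index())
print(unique_joined)
```

Here a@x.com ends up with "Science Math" rather than an alphabetical ordering, because Science was entered first.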

Or use sets, but the order of the values may change:

df = df.groupby('contact_email').agg({'interest' : lambda x: ' '.join(set(x))}).reset_index()
print(df)
   contact_email      interest
0  info@info.com  Math Science
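If a predictable order matters with the set approach, one possible variation (a sketch, not part of the original answer) is to sort the set before joining. The df_demo name and its data are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data where the first-entered value is not alphabetically first.
df_demo = pd.DataFrame({
    'contact_email': ['info@info.com'] * 3,
    'interest': ['Science', 'Math', 'Science'],
})

# sorted(set(x)) removes duplicates and gives a deterministic alphabetical order.
sorted_joined = (df_demo.groupby('contact_email')
                        .agg({'interest': lambda x: ' '.join(sorted(set(x)))})
                        .reset_index())
print(sorted_joined)
```

This trades the original entry order for alphabetical order, which may or may not be what you want.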

Or drop_duplicates:

df = (df.drop_duplicates(subset=['contact_email','interest'])
       .groupby('contact_email')
       .agg({'interest' : ' '.join})
       .reset_index())
print(df)
   contact_email      interest
0  info@info.com  Math Science

jpp · Answer

Since you have only one function, you can use groupby + apply and utilize set:

res = df.groupby('contact_email')['interest']\
        .apply(set).apply(' '.join)\
        .reset_index()

print(res)

   contact_email      interest
0  info@info.com  Math Science
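On the question's side note about the 0, 1, 2, 3 index: a DataFrame always carries some index, so one option is to hide it only when displaying the result. The following is a minimal sketch using DataFrame.to_string(index=False); the res_demo name and the choice of unique() for de-duplication are assumptions for illustration:

```python
import pandas as pd

df_demo = pd.DataFrame({
    'contact_email': ['info@info.com', 'info@info.com', 'info@info.com'],
    'interest': ['Math', 'Science', 'Science'],
})

res_demo = (df_demo.groupby('contact_email')['interest']
                   .apply(lambda x: ' '.join(x.unique()))
                   .reset_index())

# index=False omits the 0, 1, 2, ... row labels from the printed text only;
# the DataFrame itself still has its default integer index.
text = res_demo.to_string(index=False)
print(text)
```

When writing to a file, to_csv(index=False) leaves the index out of the output in the same way.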

Tags: python, pandas, pandas-groupby

Umar.H
