Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Pandas : Pivot table : aggfunc concatenate instead of np.size or np.sum

I have some entries in dataframe like :

name, age, phonenumber
 A,10, Phone1
 A,10,Phone2
 B,21,PhoneB1
 B,21,PhoneB2
 C,23,PhoneC

Here is what I am trying to achieve as result of pivot table:

 name, age, phonenumbers, phonenocount
 A,10, "Phone1,Phone2" , 2
 B,21,  "PhoneB1,PhoneB2", 2
 C,23, "PhoneC" , 1

I was trying something like :

pd.pivot_table(phonedf, index=['name','age','phonenumbers'], values=['phonenumbers'], aggfunc=np.size)

however I want the phone numbers to be concatenated as part of aggfunc. Any Suggestions ?

like image 371
Sivaswami Jeganathan Avatar asked Aug 16 '16 18:08

Sivaswami Jeganathan


1 Answers

You can use agg function after the groupby:

df.groupby(['name', 'age'])['phonenumber'].\
    agg({'phonecount': pd.Series.nunique, 
         'phonenumber': lambda x: ','.join(x)
        }
       )

#               phonenumber  phonecount
# name  age     
#    A   10   Phone1,Phone2           2
#    B   21 PhoneB1,PhoneB2           2
#    C   23          PhoneC           1

Or a shorter version according to @root and @Jon Clements:

df.groupby(['name', 'age'])['phonenumber'].\
   agg({'phonecount': 'nunique', 'phonenumber': ','.join})
like image 66
Psidom Avatar answered Oct 15 '22 06:10

Psidom