pandas add column to groupby dataframe

Question

I have this simple dataframe df:

    import pandas as pd

    df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2], 'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})

My goal is to count the values of type for each c, and then add a column with the size of each c group. So starting with:

    In [27]: g = df.groupby('c')['type'].value_counts().reset_index(name='t')

    In [28]: g
    Out[28]:
       c type  t
    0  1    m  1
    1  1    n  1
    2  1    o  1
    3  2    m  2
    4  2    n  2

the first problem is solved. I can also compute the size of each group:

    In [29]: a = df.groupby('c').size().reset_index(name='size')

    In [30]: a
    Out[30]:
       c  size
    0  1     3
    1  2     4

How can I add the size column directly to the first dataframe? So far I have used map:

    In [31]: a.index = a['c']

    In [32]: g['size'] = g['c'].map(a['size'])

    In [33]: g
    Out[33]:
       c type  t  size
    0  1    m  1     3
    1  1    n  1     3
    2  1    o  1     3
    3  2    m  2     4
    4  2    n  2     4

which works, but is there a more straightforward way to do this?

EdChum · Accepted Answer

Use transform to add a column from a groupby aggregation back to the original df. transform returns a Series whose index is aligned to the original df:

    In [123]:
    g = df.groupby('c')['type'].value_counts().reset_index(name='t')
    g['size'] = df.groupby('c')['type'].transform('size')
    g

    Out[123]:
       c type  t  size
    0  1    m  1     3
    1  1    n  1     3
    2  1    o  1     3
    3  2    m  2     4
    4  2    n  2     4
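One caveat on the snippet above: the transform result is indexed like df (rows 0 to 6), and assigning it to g aligns on index labels, so rows 0 to 4 of g receive the values computed for rows 0 to 4 of df. In this example those rows happen to carry the same c values, but that will not hold for every dataframe. A minimal sketch that avoids relying on this, assuming it is acceptable to derive the size from g itself by summing the counts t within each c:

```python
import pandas as pd

df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2],
                   'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})

# Count each type within each c, as in the question.
g = df.groupby('c')['type'].value_counts().reset_index(name='t')

# The counts within one c add up to the size of that c group.
# Running transform on g keeps the result aligned with g's own index.
g['size'] = g.groupby('c')['t'].transform('sum')

print(g)
```

Because the transform is computed on g rather than df, the new column has one value per row of g regardless of how df is ordered or indexed.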

jezrael · Answer

Another solution uses transform with len:

    df['size'] = df.groupby('c')['type'].transform(len)
    print(df)

       c type size
    0  1    m    3
    1  1    n    3
    2  1    o    3
    3  2    m    4
    4  2    m    4
    5  2    n    4
    6  2    n    4
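To make the relationship between the two transform variants concrete, here is a small sketch comparing transform(len) with transform('size') on the same df. The variable names by_len and by_size are only illustrative and not from the answers.

```python
import pandas as pd

df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2],
                   'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})

# Both calls broadcast the group size back to every row of df.
by_len = df.groupby('c')['type'].transform(len)
by_size = df.groupby('c')['type'].transform('size')

# The result has one entry per row of df and shares df's index.
print(by_size.index.equals(df.index))

# The two variants agree value by value.
print((by_len == by_size).all())
```

Passing the string 'size' lets pandas use its built-in group size, while passing len calls the Python function once per group; on a dataframe this small the difference is not noticeable.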

Another solution with Series.map and Series.value_counts:

    df['size'] = df['c'].map(df['c'].value_counts())
    print(df)

       c type  size
    0  1    m     3
    1  1    n     3
    2  1    o     3
    3  2    m     4
    4  2    m     4
    5  2    n     4
    6  2    n     4
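The same map idea can be pointed at the counted dataframe g from the question instead of df. This is a sketch under the assumption that g is built exactly as in the question; the intermediate name sizes is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2],
                   'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})

g = df.groupby('c')['type'].value_counts().reset_index(name='t')

# value_counts returns a Series indexed by the distinct values of c,
# so it can serve directly as a lookup table for map.
sizes = df['c'].value_counts()
g['size'] = g['c'].map(sizes)

print(g)
```

Compared with the approach in the question, this skips the separate dataframe a and the manual index assignment, since the lookup Series is already keyed by c.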

Tags: python, pandas

Fabio Lamanna
