I have this simple dataframe df:
df = pd.DataFrame({'c':[1,1,1,2,2,2,2],'type':['m','n','o','m','m','n','n']})
My goal is to count the values of type for each c, and then add a column with the size of c. So starting with:
In [27]: g = df.groupby('c')['type'].value_counts().reset_index(name='t')

In [28]: g
Out[28]:
   c type  t
0  1    m  1
1  1    n  1
2  1    o  1
3  2    m  2
4  2    n  2
the first problem is solved. Then I can also get the size of each c:
In [29]: a = df.groupby('c').size().reset_index(name='size')

In [30]: a
Out[30]:
   c  size
0  1     3
1  2     4
How can I add the size column directly to the first dataframe? So far I used map as:
In [31]: a.index = a['c']

In [32]: g['size'] = g['c'].map(a['size'])

In [33]: g
Out[33]:
   c type  t  size
0  1    m  1     3
1  1    n  1     3
2  1    o  1     3
3  2    m  2     4
4  2    n  2     4
which works, but is there a more straightforward way to do this?
Use transform to add a column back to the original df from a groupby aggregation. transform returns a Series with its index aligned to the original df:
In [123]:
g = df.groupby('c')['type'].value_counts().reset_index(name='t')
g['size'] = df.groupby('c')['type'].transform('size')
g

Out[123]:
   c type  t  size
0  1    m  1     3
1  1    n  1     3
2  1    o  1     3
3  2    m  2     4
4  2    n  2     4
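One thing to watch: transform is aligned with the frame it was called on (df, 7 rows), while g has only 5 rows, so the assignment above lines up by index label and matches here only because df happens to be sorted by c. A minimal sketch of a variant that does not depend on row order, assuming type has no missing values so that the per-type counts in t add up to the group size. The shuffled input below is made up for illustration:

```python
import pandas as pd

# Same rows as in the question, but deliberately not sorted by 'c'.
df = pd.DataFrame({'c': [2, 1, 2, 1, 2, 1, 2],
                   'type': ['m', 'm', 'm', 'n', 'n', 'o', 'n']})

g = df.groupby('c')['type'].value_counts().reset_index(name='t')

# Summing the counts within each 'c' inside g gives the group size,
# and the result is indexed like g itself rather than like df.
g['size'] = g.groupby('c')['t'].transform('sum')
print(g)
```

Because transform is called on g here, the new column always has one value per row of g, whatever the order of the original df.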
Another solution with transform and len:
df['size'] = df.groupby('c')['type'].transform(len)
print(df)
   c type  size
0  1    m     3
1  1    n     3
2  1    o     3
3  2    m     4
4  2    m     4
5  2    n     4
6  2    n     4
Another solution with Series.map and Series.value_counts:
df['size'] = df['c'].map(df['c'].value_counts())
print(df)
   c type  size
0  1    m     3
1  1    n     3
2  1    o     3
3  2    m     4
4  2    m     4
5  2    n     4
6  2    n     4
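The same map idea can also be applied to the aggregated frame g from the question, without building the helper frame a first. A short sketch, reusing the names from the question (sizes is just an illustrative variable name):

```python
import pandas as pd

df = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2],
                   'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n']})

g = df.groupby('c')['type'].value_counts().reset_index(name='t')

# value_counts() on 'c' yields a Series indexed by the values of c
# (group key -> number of rows), which map() uses as a lookup table.
sizes = df['c'].value_counts()
g['size'] = g['c'].map(sizes)
print(g)
```

Since map looks values up by key rather than by row position, this works regardless of how df or g is ordered.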