Better alternative to a groupby with a merge [duplicate]

Question

I was wondering if anyone knows a better way to do what I am currently doing. Here is an example data set:

ID  Number
a   1
a   2
a   3
b   4
c   5
c   6
c   7
c   8

For example, suppose I want a count of Number values per ID in the table above. I would first group by ID and count Number, then merge the result back into the original table like so:

df2 = df.groupby('ID').agg({'Number':'count'}).reset_index()

df2 = df2.rename(columns = {'Number':'Number_Count'})

df = pd.merge(df, df2, on = ['ID'])

This results in the original table with an additional Number_Count column containing the count for each row's ID.

This feels like a roundabout way of doing it. Does anyone know a better alternative? I ask because with large data sets this method can use a lot of memory, since it creates a second table and then merges the two.

zipa · Accepted Answer

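You can do this in one step with groupby and transform, which returns the per-group count aligned to the original rows, so no merge is needed: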

import pandas as pd

df = pd.DataFrame({'ID': list('aaabcccc'),
                   'Number': range(1,9)})

# Select the column explicitly so transform returns a Series, not a DataFrame
df['Number_Count'] = df.groupby('ID')['Number'].transform('count')

df

#  ID  Number  Number_Count
#0  a       1             3
#1  a       2             3
#2  a       3             3
#3  b       4             1
#4  c       5             4
#5  c       6             4
#6  c       7             4
#7  c       8             4
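
As a quick check, here is a minimal sketch that builds the column three ways on the sample data: the groupby-and-merge approach from the question, the transform approach above, and a map over value_counts. The third variant is not from the original answer and is offered only as an assumption about what might also suit this case; the names counts, merged, transformed and mapped are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ID': list('aaabcccc'),
                   'Number': range(1, 9)})

# Approach from the question: aggregate, rename, then merge back.
counts = (df.groupby('ID')
            .agg({'Number': 'count'})
            .reset_index()
            .rename(columns={'Number': 'Number_Count'}))
merged = pd.merge(df, counts, on=['ID'])

# Approach from the answer: transform broadcasts each group's count
# back onto the rows of that group, so no second table is merged.
transformed = df.copy()
transformed['Number_Count'] = df.groupby('ID')['Number'].transform('count')

# Alternative (an assumption, not in the original answer): look up each
# row's ID in the per-ID row frequencies.
mapped = df.copy()
mapped['Number_Count'] = df['ID'].map(df['ID'].value_counts())

print(merged['Number_Count'].tolist())
print(transformed['Number_Count'].tolist())
print(mapped['Number_Count'].tolist())
```

Note that 'count' skips missing values in Number, while value_counts on ID counts rows, so the variants only agree when Number has no NaN, as in this sample.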

Tags: python, merge, pandas, group-by

Brian
