I have a df that looks like the following:
    id  item   color
    01  truck  red
    02  truck  red
    03  car    black
    04  truck  blue
    05  car    black
I am trying to create a df that looks like this:
    item   color  count
    truck  red    2
    truck  blue   1
    car    black  2
I have tried:
    df["count"] = df.groupby("item")["color"].transform('count')
But it is not quite what I am looking for.
Any guidance is appreciated.
That's not a new column, that's a new DataFrame:
    In [11]: df.groupby(["item", "color"]).count()
    Out[11]:
                 id
    item  color
    car   black   2
    truck blue    1
          red     2
To get the result you want, use reset_index:
    In [12]: df.groupby(["item", "color"])["id"].count().reset_index(name="count")
    Out[12]:
        item  color  count
    0    car  black      2
    1  truck   blue      1
    2  truck    red      2
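For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question and produces the one-row-per-group result. The ids are assumed to be strings, which the question does not specify, and the `size()` variant is an alternative I am adding rather than part of the answer above; it counts rows per group regardless of missing values, whereas `count()` skips NaN in the selected column.

```python
import pandas as pd

# Example data from the question (ids assumed to be strings).
df = pd.DataFrame({
    "id": ["01", "02", "03", "04", "05"],
    "item": ["truck", "truck", "car", "truck", "car"],
    "color": ["red", "red", "black", "blue", "black"],
})

# Count non-null ids per (item, color) pair, then turn the
# MultiIndex back into ordinary columns named item, color, count.
counts = (
    df.groupby(["item", "color"])["id"]
      .count()
      .reset_index(name="count")
)

# Alternative: size() counts rows per group, including rows with NaN.
counts_by_size = df.groupby(["item", "color"]).size().reset_index(name="count")

print(counts)
```

Groups come back sorted by the grouping keys by default, so the row order differs from the order shown in the question even though the counts are the same.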
To get a "new column" you could use transform:
    In [13]: df.groupby(["item", "color"])["id"].transform("count")
    Out[13]:
    0    2
    1    2
    2    2
    3    1
    4    2
    dtype: int64
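If the goal really is a new column on the original frame, the transform result can be assigned directly, since it is aligned on the original index. A small sketch, again assuming string ids and reusing the column name "count" from the question's attempt:

```python
import pandas as pd

# Example data from the question (ids assumed to be strings).
df = pd.DataFrame({
    "id": ["01", "02", "03", "04", "05"],
    "item": ["truck", "truck", "car", "truck", "car"],
    "color": ["red", "red", "black", "blue", "black"],
})

# transform returns one value per original row, so every row
# receives the size of the (item, color) group it belongs to.
df["count"] = df.groupby(["item", "color"])["id"].transform("count")

print(df)
```

This keeps all five rows; the difference from the attempt in the question is only that the grouping is on both item and color rather than on item alone.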
I recommend reading the split-apply-combine section of the docs.