Pandas create new column with count from groupby

Tags: python, pandas

I have a df that looks like the following:

id        item        color
01        truck       red
02        truck       red
03        car         black
04        truck       blue
05        car         black

I am trying to create a df that looks like this:

item      color       count
truck     red          2
truck     blue         1
car       black        2

I have tried

df["count"] = df.groupby("item")["color"].transform('count')

But this is not quite what I am looking for.

Any guidance is appreciated.

asked Apr 24 '15 00:04

GNMO11

1 Answer

That's not a new column, that's a new DataFrame:

In [11]: df.groupby(["item", "color"]).count()
Out[11]:
             id
item  color
car   black   2
truck blue    1
      red     2

To get the result you want, use reset_index:

In [12]: df.groupby(["item", "color"])["id"].count().reset_index(name="count")
Out[12]:
    item  color  count
0    car  black      2
1  truck   blue      1
2  truck    red      2
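For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. It assumes the question's data with id stored as strings, and the names counts and counts_via_size are only illustrative. It also shows size() as an alternative to count(): size() counts rows per group, while count() skips missing values in the selected column.

```python
import pandas as pd

# Sample data rebuilt from the question (ids assumed to be strings).
df = pd.DataFrame({
    "id": ["01", "02", "03", "04", "05"],
    "item": ["truck", "truck", "car", "truck", "car"],
    "color": ["red", "red", "black", "blue", "black"],
})

# One row per (item, color) pair, with the group count as a regular column.
counts = df.groupby(["item", "color"])["id"].count().reset_index(name="count")

# size() counts rows per group regardless of missing values in "id".
counts_via_size = df.groupby(["item", "color"]).size().reset_index(name="count")

print(counts)
```

With no missing values in id, both versions should give the same counts here.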

To get a "new column" you could use transform:

In [13]: df.groupby(["item", "color"])["id"].transform("count")
Out[13]:
0    2
1    2
2    2
3    1
4    2
dtype: int64
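Because transform returns a Series aligned to the original index, it can be assigned straight back to the frame. A short sketch, again assuming the question's data with string ids (the name df_with_count is only illustrative):

```python
import pandas as pd

# Sample data rebuilt from the question (ids assumed to be strings).
df_with_count = pd.DataFrame({
    "id": ["01", "02", "03", "04", "05"],
    "item": ["truck", "truck", "car", "truck", "car"],
    "color": ["red", "red", "black", "blue", "black"],
})

# Group by both keys, then broadcast each group's count back to its rows.
df_with_count["count"] = (
    df_with_count.groupby(["item", "color"])["id"].transform("count")
)

print(df_with_count)
```

The frame keeps all five rows. Each row just carries the size of its (item, color) group, which differs from the question's attempt only in grouping by both columns instead of item alone.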

I recommend reading the split-apply-combine section of the docs.

answered Sep 19 '22 03:09

Andy Hayden
