Pandas number rows within group in increasing order

Given the following data frame:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'B': ['a', 'a', 'b', 'a', 'a', 'a']})
df

   A  B
0  A  a
1  A  a
2  A  b
3  B  a
4  B  a
5  B  a

I'd like to create column 'C', which numbers the rows within each group in columns A and B like this:

   A  B  C
0  A  a  1
1  A  a  2
2  A  b  1
3  B  a  1
4  B  a  2
5  B  a  3

I've tried this so far:

df['C']=df.groupby(['A','B'])['B'].transform('rank')

...but it doesn't work!

asked Jun 23 '16 16:06

Dance Party2

1 Answer

Use groupby with cumcount, which numbers the rows of each group starting from 0 (hence the +1):

In [25]: df['C'] = df.groupby(['A','B']).cumcount()+1; df
Out[25]:
   A  B  C
0  A  a  1
1  A  a  2
2  A  b  1
3  B  a  1
4  B  a  2
5  B  a  3
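For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. It rebuilds the frame from the question and also shows the `ascending=False` option of `cumcount`, which numbers each group from the end. The column name `C_desc` is only an illustrative choice and is not part of the original question.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({'A': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'B': ['a', 'a', 'b', 'a', 'a', 'a']})

# cumcount() numbers the rows of each (A, B) group 0, 1, 2, ... in their
# original order; adding 1 makes the numbering start at 1.
df['C'] = df.groupby(['A', 'B']).cumcount() + 1

# ascending=False counts down instead, so the last row of each group gets 1.
# 'C_desc' is a hypothetical column name used only for this illustration.
df['C_desc'] = df.groupby(['A', 'B']).cumcount(ascending=False) + 1

print(df)
```

The numbering follows the order in which rows appear in the frame, so if a specific order within each group matters, sort the frame first (for example with `sort_values`) before calling `cumcount`.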

answered Sep 28 '22 10:09

unutbu


Tags: python-3.x, pandas, group-by, pandas-groupby, rank
