
Pandas number rows within group in increasing order

Given the following data frame:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'A': ['A', 'A', 'A', 'B', 'B', 'B'],
                       'B': ['a', 'a', 'b', 'a', 'a', 'a'],
                       })
    df

       A  B
    0  A  a
    1  A  a
    2  A  b
    3  B  a
    4  B  a
    5  B  a

I'd like to create column 'C', which numbers the rows within each group defined by columns A and B, like this:

       A  B  C
    0  A  a  1
    1  A  a  2
    2  A  b  1
    3  B  a  1
    4  B  a  2
    5  B  a  3

I've tried this so far:

    df['C'] = df.groupby(['A', 'B'])['B'].transform('rank')

...but it doesn't work!

Asked by Dance Party2 on Jun 23 '16 at 16:06



1 Answer

Use groupby/cumcount:

    In [25]: df['C'] = df.groupby(['A','B']).cumcount()+1; df
    Out[25]:
       A  B  C
    0  A  a  1
    1  A  a  2
    2  A  b  1
    3  B  a  1
    4  B  a  2
    5  B  a  3
Answered by unutbu on Sep 28 '22 at 10:09
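
For anyone who wants to run the groupby/cumcount approach outside an IPython session, here is a minimal, self-contained sketch. It rebuilds the question's frame and applies the same call. The second frame, `mixed`, is a hypothetical example that does not appear in the question. It is only there to illustrate that rows are numbered per group even when the groups are interleaved.

```python
import pandas as pd

# Frame from the question.
df = pd.DataFrame({'A': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'B': ['a', 'a', 'b', 'a', 'a', 'a']})

# cumcount() numbers the rows of each (A, B) group 0, 1, 2, ... in the
# order they appear in the frame, so adding 1 gives a 1-based counter.
df['C'] = df.groupby(['A', 'B']).cumcount() + 1

# Hypothetical extra frame (not from the question): the two groups are
# interleaved rather than contiguous, and each still gets its own counter.
mixed = pd.DataFrame({'A': ['A', 'B', 'A', 'B'],
                      'B': ['a', 'a', 'a', 'a']})
mixed['C'] = mixed.groupby(['A', 'B']).cumcount() + 1

print(df)
print(mixed)
```

Because the counter follows the existing row order, sorting the frame first (for example with `sort_values`) should change which row in a group receives which number.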