Given the following data frame:
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'A': ['A','A','A','B','B','B'],
                       'B': ['a','a','b','a','a','a'],
                       })
    df

       A  B
    0  A  a
    1  A  a
    2  A  b
    3  B  a
    4  B  a
    5  B  a
I'd like to create column 'C', which numbers the rows within each group in columns A and B like this:
       A  B  C
    0  A  a  1
    1  A  a  2
    2  A  b  1
    3  B  a  1
    4  B  a  2
    5  B  a  3
I've tried this so far:
df['C']=df.groupby(['A','B'])['B'].transform('rank')
...but it doesn't work!
Groupby preserves the order of rows within each group. When calling apply, add group keys to index to identify pieces. Reduce the dimensionality of the return type if possible, otherwise return a consistent type.
Use count() by column name: call pandas DataFrame.groupby() to group the rows by a column, then use the count() method to get the number of entries in each group, ignoring None and NaN values. It works with non-floating-point data as well.
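As a rough illustration of the count() behaviour described above, here is a minimal sketch. The frame `df_missing` is a hypothetical variant of the question's data with one missing value added; it is not part of the original question.

```python
import pandas as pd

# Hypothetical variant of the question's data with one missing value in 'B'.
df_missing = pd.DataFrame({
    'A': ['A', 'A', 'A', 'B', 'B', 'B'],
    'B': ['a', 'a', None, 'a', 'a', 'a'],
})

# count() reports the non-missing entries of 'B' per group of 'A'.
non_null = df_missing.groupby('A')['B'].count()

# size() reports every row per group, missing values included.
all_rows = df_missing.groupby('A').size()

print(non_null)
print(all_rows)
```

Note that count() gives one number per group, so on its own it does not number the rows inside a group.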
Use groupby/cumcount:

    In [25]: df['C'] = df.groupby(['A','B']).cumcount()+1; df
    Out[25]:
       A  B  C
    0  A  a  1
    1  A  a  2
    2  A  b  1
    3  B  a  1
    4  B  a  2
    5  B  a  3
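For anyone who wants to run this outside IPython, the same approach as a self-contained script might look like the sketch below. It assumes only the data frame from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'A', 'A', 'B', 'B', 'B'],
    'B': ['a', 'a', 'b', 'a', 'a', 'a'],
})

# cumcount() numbers the rows of each (A, B) group 0, 1, 2, ... in their
# original order; adding 1 makes the numbering start at 1.
df['C'] = df.groupby(['A', 'B']).cumcount() + 1

print(df)
```

cumcount() also takes an `ascending` argument; passing `ascending=False` numbers the rows of each group in reverse.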