
Grouping and auto increment based on columns in pandas

Tags: python, pandas

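I have a pandas DataFrame with a date index and Product and SubProd columns; the code that builds it and its printed output are shown below.

Is there a way to add a last column (NeedThis) holding a counter that restarts at 1 for each Product/SubProd group, without having to iterate through the DataFrame?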

I was playing with the results of "Grouping and auto incrementing group id in pandas", but I haven't made it work for my purposes.

Here is the code to produce the DataFrame:

import pandas as pd

# Target layout; 'NeedThis' is the column to be added (this list is not used below)
columns = ['Product', 'SubProd', 'NeedThis']
index = ['4/20/2012', '4/27/2012', '5/4/2012', '5/11/2012', '5/18/2012', '4/20/2012',
         '4/27/2012', '5/4/2012', '5/11/2012', '5/18/2012', '5/25/2012', '10/31/2014', '11/7/2014',
         '11/14/2014', '11/21/2014', '11/28/2014']
data = {'Product': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
        'SubProd': ['BL', 'BL', 'BL', 'BL', 'BL', 'lk', 'lk', 'lk', 'lk', 'lk', 'lk', 'po', 'po', 'po', 'po', 'po']}
# The date strings become the (non-unique) row index
df = pd.DataFrame(data=data, index=index)
print(df)

Output:

           Product SubProd
4/20/2012        A      BL
4/27/2012        A      BL
5/4/2012         A      BL
5/11/2012        A      BL
5/18/2012        A      BL
4/20/2012        A      lk
4/27/2012        A      lk
5/4/2012         A      lk
5/11/2012        A      lk
5/18/2012        A      lk
5/25/2012        A      lk
10/31/2014       B      po
11/7/2014        B      po
11/14/2014       B      po
11/21/2014       B      po
11/28/2014       B      po

Thanks

asked Oct 07 '15 17:10 by kizofilax



1 Answer

In [10]: df['counter'] = df.groupby(['Product','SubProd']).cumcount()+1

In [11]: df
Out[11]: 
           Product SubProd  counter
4/20/2012        A      BL        1
4/27/2012        A      BL        2
5/4/2012         A      BL        3
5/11/2012        A      BL        4
5/18/2012        A      BL        5
4/20/2012        A      lk        1
4/27/2012        A      lk        2
5/4/2012         A      lk        3
5/11/2012        A      lk        4
5/18/2012        A      lk        5
5/25/2012        A      lk        6
10/31/2014       B      po        1
11/7/2014        B      po        2
11/14/2014       B      po        3
11/21/2014       B      po        4
11/28/2014       B      po        5
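
For readers who want to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. It assumes a shortened version of the question's frame (two rows per group, chosen only for illustration) and uses the same `groupby(...).cumcount()` call as the answer above. `cumcount()` numbers the rows of each group starting at 0 in their existing order, which is why 1 is added.

```python
import pandas as pd

# Shortened stand-in for the question's frame (two rows per group, for illustration only).
df = pd.DataFrame(
    {
        "Product": ["A", "A", "A", "A", "B", "B"],
        "SubProd": ["BL", "BL", "lk", "lk", "po", "po"],
    },
    index=["4/20/2012", "4/27/2012", "4/20/2012", "4/27/2012", "10/31/2014", "11/7/2014"],
)

# cumcount() numbers the rows within each (Product, SubProd) group as 0, 1, 2, ...
# in their existing order; adding 1 makes the counter start at 1.
df["counter"] = df.groupby(["Product", "SubProd"]).cumcount() + 1

print(df)
```

No explicit loop over rows is needed, and the counter restarts whenever the Product/SubProd combination changes.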
answered Sep 19 '22 07:09 by Jeff