Groupby in python pandas: Fast Way

I want to speed up a groupby in pandas. I have this code:

df["Nbcontrats"] = df.groupby(['Client', 'Month'])['Contrat'].transform(len)

The goal is to count how many contracts each client has in a given month and store that count in a new column (Nbcontrats).

  • Client: client code
  • Month: month of data extraction
  • Contrat: contract number
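
To make the expected result concrete, here is a minimal sketch on a tiny made-up frame. The values are invented for illustration, and only the three columns involved are included (the real data has 61):

```python
import pandas as pd

# Made-up sample data; only the column names come from the real frame.
df = pd.DataFrame({
    "Client": ["A", "A", "A", "B", "B"],
    "Month": ["2015-01", "2015-01", "2015-02", "2015-01", "2015-01"],
    "Contrat": [101, 102, 103, 201, 202],
})

# Each row receives the number of rows that share its (Client, Month) pair.
df["Nbcontrats"] = df.groupby(["Client", "Month"])["Contrat"].transform(len)
print(df)
```

Client A has two contracts in 2015-01, so both of those rows should get 2, while the single 2015-02 row gets 1.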

The timing below was measured on only a subset of my real data:

%timeit df["Nbcontrats"] = df.groupby(['Client', 'Month'])['Contrat'].transform(len)
1 loops, best of 3: 391 ms per loop

df.shape
Out[309]: (7464, 61)

How can I improve the execution time?
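
One candidate that may be worth timing is the built-in `'size'` aggregation in place of the Python function `len`. `transform(len)` calls `len` once per group in Python, whereas a string name such as `'size'` lets pandas use its own grouped implementation. The sketch below uses randomly generated data (an assumption, not my real data) with the same number of rows, and checks that both variants give the same counts; any speedup would have to be measured with `%timeit` on the real frame:

```python
import numpy as np
import pandas as pd

# Randomly generated stand-in data; sizes and value ranges are assumptions.
rng = np.random.default_rng(0)
n_rows = 7464
df = pd.DataFrame({
    "Client": rng.integers(0, 500, size=n_rows),
    "Month": rng.integers(1, 13, size=n_rows),
    "Contrat": np.arange(n_rows),
})

grouped = df.groupby(["Client", "Month"])["Contrat"]

# Original approach: a Python-level len() call for every group.
slow = grouped.transform(len)

# Candidate approach: the built-in 'size' aggregation, broadcast to each row.
fast = grouped.transform("size")

df["Nbcontrats"] = fast
print((slow == fast).all())
```

Both variants count every row in the group, including rows where Contrat is missing, so the result should be unchanged.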