
Sum column based on another column in Pandas DataFrame

I have a pandas DataFrame like this:

>>> import pandas as pd
>>> df = pd.DataFrame({'MONTREGL':[10,10,2222,35,200,56,5555],'SINID':['aaa','aaa','aaa','bbb','bbb','ccc','ccc'],'EXTRA':[400,400,400,500,500,333,333]})
>>> df
   MONTREGL SINID EXTRA
0        10   aaa   400
1        10   aaa   400
2      2222   aaa   400
3        35   bbb   500
4       200   bbb   500
5        56   ccc   333
6      5555   ccc   333

I want to sum the column MONTREGL within each SINID group.

So I should get 2242 for aaa, and so on. I also want to keep the value of column EXTRA, which is the same for every row of a given SINID.

This is the expected result:

   MONTREGL SINID EXTRA
0      2242   aaa   400
1       235   bbb   500
2      5611   ccc   333

Thanks for your help in advance!

Soufiane Sabiri asked May 29 '19 12:05



1 Answer

I ended up grouping by both SINID and EXTRA and summing MONTREGL:

dff = df.groupby(["SINID","EXTRA"]).MONTREGL.sum().reset_index()

This works on the test data above and in production.
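
For anyone who wants to try this end to end, here is a minimal sketch, assuming EXTRA is constant within each SINID as in the sample data. The first variant is the approach from the answer. The second variant (grouping on SINID alone and taking the first EXTRA per group) is not from the answer and is shown only for comparison. The names by_both, by_sinid, cols and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "MONTREGL": [10, 10, 2222, 35, 200, 56, 5555],
    "SINID": ["aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ccc"],
    "EXTRA": [400, 400, 400, 500, 500, 333, 333],
})

# Column order requested in the question.
cols = ["MONTREGL", "SINID", "EXTRA"]

# Approach from the answer: EXTRA is part of the group key, so it
# survives the aggregation; as_index=False replaces reset_index().
by_both = df.groupby(["SINID", "EXTRA"], as_index=False)["MONTREGL"].sum()

# Alternative (an assumption, not the answer's method): group on SINID
# only and keep the first EXTRA value seen in each group.
by_sinid = df.groupby("SINID", as_index=False).agg(
    {"MONTREGL": "sum", "EXTRA": "first"}
)

# groupby puts the key columns first, so reorder to match the question.
result = by_both[cols]
print(result)
```

The two variants agree on this data. They differ if EXTRA ever varies within a SINID: grouping on both columns then produces one row per (SINID, EXTRA) pair, while the "first" aggregation keeps a single row and silently picks one EXTRA value.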

Soufiane Sabiri answered Oct 12 '22 19:10