I have a pandas DataFrame like this:
>>> df = pd.DataFrame({'MONTREGL':[10,10,2222,35,200,56,5555],'SINID':['aaa','aaa','aaa','bbb','bbb','ccc','ccc'],'EXTRA':[400,400,400,500,500,333,333]})
>>> df
MONTREGL SINID EXTRA
0 10 aaa 400
1 10 aaa 400
2 2222 aaa 400
3 35 bbb 500
4 200 bbb 500
5 56 ccc 333
6 5555 ccc 333
I want to sum the column MONTREGL
for each groupby SINID
...
So I get 2242 for aaa and so on... ALSO I want to keep the value of column EXTRA
.
This is the expected result:
MONTREGL SINID EXTRA
0 2242 aaa 400
1 235 bbb 500
2 5611 ccc 333
Thanks for your help in advance!
(1) Select the column name that you will sum based on, and then click the Primary Key button; (2) Select the column name that you will sum, and then click the Calculate > Sum. (3) Click the Ok button.
You can extract a column of pandas DataFrame based on another value by using the DataFrame. query() method. The query() is used to query the columns of a DataFrame with a boolean expression.
Using apply() method If you need to apply a method over an existing column in order to compute some values that will eventually be added as a new column in the existing DataFrame, then pandas. DataFrame. apply() method should do the trick.
I ended up using this script:
dff = df.groupby(["SINID","EXTRA"]).MONTREGL.sum().reset_index()
And it works in this test and production.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With