I am new to pandas and I have googled my problem but did not get any help.
Problem Statement:
When I save final CSV using df.to_csv() after performing cumsum() on one of the column amount, My CSV only showing one column amount but I want all the columns in my final CSV.
Sample Data:
*------------------------------------------------*
|effective_date | account_id | currency | amount |
*------------------------------------------------*
| 12/26/19 1 USD 50 |
| 12/27/19 1 USD 70 |
| 11/06/19 2 USD 90 |
| 11/07/19 2 USD 30 |
*------------------------------------------------*
My Code using Jupyter Notebook:
import pandas as pd
df = pd.read_csv('payments.csv', index_col=0)
df['effective_when'] = pd.to_datetime(df['effective_when'])
df = df.groupby(['account_id', 'currency', 'effective_date']).sum().groupby(level=[0]).cumsum()
df.to_csv ('cumulativePayments.csv')
Current Result:
*------*
|amount|
*------*
| 50 |
| 120 |
| 90 |
| 120 |
*------*
Expected Result:
*------------------------------------------------*
|effective_date | account_id | currency | amount |
*------------------------------------------------*
| 12/26/19 1 USD 50 |
| 12/27/19 1 USD 120 |
| 11/06/19 2 USD 90 |
| 11/07/19 2 USD 120 |
*------------------------------------------------*
how can I achieve this?
I think you can split code to create DataFrame with aggregation sum and then reaasign amount columns filled by cumulative sums:
df = pd.read_csv('payments.csv', index_col=0)
df['effective_date'] = pd.to_datetime(df['effective_date'])
df = df.groupby(['account_id', 'currency', 'effective_date'], as_index=False).sum()
df['amount'] = df.groupby('account_id')['amount'].cumsum()
print (df)
account_id currency effective_date amount
0 1 USD 2019-12-26 50
1 1 USD 2019-12-27 120
2 2 USD 2019-11-06 90
3 2 USD 2019-11-07 120
df.to_csv('cumulativePayments.csv')
Another idea should be convert first column to datetimes and for write to file remove default index values:
df = pd.read_csv('payments.csv', parse_dates=[0])
df = df.groupby(['account_id', 'currency', 'effective_date'], as_index=False).sum()
df['amount'] = df.groupby('account_id')['amount'].cumsum()
df.to_csv('cumulativePayments.csv', index=False)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With