Cumulative sum only applying on 1 column python

Question

I would like to apply cumsum to one specific column only, because the values in my other columns must stay the same.

This is the script that I have so far:

df.groupby(by=['name','day']).sum().groupby(level=[0]).cumsum()

However, this script takes the cumulative sum of every column in my pandas df. The only column that should be cumulatively summed is data.

As requested, here is some sample data:

df = pd.DataFrame({'ID': ["880022443344556677787", "880022443344556677782", "880022443344556677787",
                          "880022443344556677782", "880022443344556677787", "880022443344556677782",
                          "880022443344556677781"],
                   'Month': ["201701", "201701", "201702", "201702", "201703", "201703", "201703"],
                   'Usage': [20, 40, 100, 50, 30, 30, 2000],
                   'Sec': [10, 15, 20, 1, 5, 6, 30]})

                      ID   Month  Sec  Usage
0  880022443344556677787  201701   10     20
1  880022443344556677782  201701   15     40
2  880022443344556677787  201702   20    100
3  880022443344556677782  201702    1     50
4  880022443344556677787  201703    5     30
5  880022443344556677782  201703    6     30
6  880022443344556677781  201703   30   2000

Desired output

                      ID   Month  Sec  Usage
0  880022443344556677787  201701   10     20
1  880022443344556677782  201701   15     40
2  880022443344556677787  201702   20    120
3  880022443344556677782  201702    1     90
4  880022443344556677787  201703    5    150
5  880022443344556677782  201703    6    120
6  880022443344556677781  201703   30   2000

piRSquared · Accepted Answer

Consider the dataframe df

import numpy as np
import pandas as pd

df = pd.DataFrame(dict(
        name=list('aaaaaaaabbbbbbbb'),
        day=np.tile(np.arange(2).repeat(4), 2),
        data=np.arange(16)
    ))

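First, restrict the cumsum to a single column by selecting that column right after the groupby call. The inner sum gives one total per (name, day) pair, and the second groupby on level 0 accumulates those totals across days within each name.

Second, add the result back to the dataframe df with join, matching on name and day.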

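# Total of data per (name, day), then a running total across days within each name
d2 = df.groupby(['name', 'day'])['data'].sum().groupby(level=0).cumsum()

# Attach the running totals to the original rows as data_cum
df.join(d2, on=['name', 'day'], rsuffix='_cum')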

    data  day name  data_cum
0      0    0    a         6
1      1    0    a         6
2      2    0    a         6
3      3    0    a         6
4      4    1    a        28
5      5    1    a        28
6      6    1    a        28
7      7    1    a        28
8      8    0    b        38
9      9    0    b        38
10    10    0    b        38
11    11    0    b        38
12    12    1    b        92
13    13    1    b        92
14    14    1    b        92
15    15    1    b        92
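For the sample data in the question, where each (ID, Month) pair appears only once, a shorter route may be enough. The following is a minimal sketch, not part of the original answer: it groups by ID, selects only Usage, and writes the running total back into that column, so Sec and Month are left alone. It assumes the rows are already in chronological order within each ID; the name out is just an illustrative variable.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ["880022443344556677787", "880022443344556677782", "880022443344556677787",
           "880022443344556677782", "880022443344556677787", "880022443344556677782",
           "880022443344556677781"],
    'Month': ["201701", "201701", "201702", "201702", "201703", "201703", "201703"],
    'Usage': [20, 40, 100, 50, 30, 30, 2000],
    'Sec': [10, 15, 20, 1, 5, 6, 30],
})

# Work on a copy so the original frame stays unchanged
out = df.copy()

# Running total of Usage within each ID, in row order.
# groupby(...).cumsum() returns a Series aligned to the original index,
# so it can be assigned straight back to the column.
out['Usage'] = out.groupby('ID')['Usage'].cumsum()

print(out)
```

If the rows might not be in month order, sorting by Month before the groupby should give the same result, since the assignment aligns on the index rather than on position.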

Tags: python, pandas, cumulative-sum

Joe_ft

1 Answer

piRSquared
