I have

df = pd.DataFrame.from_dict({'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'],
                             'val': [1, 2, -3, 1, 5, 6, -2],
                             'stuff': ['12', '23232', '13', '1234', '3235', '3236', '732323']})

  id   stuff  val
0  A      12    1
1  B   23232    2
2  A      13   -3
3  C    1234    1
4  D    3235    5
5  B    3236    6
6  C  732323   -2
I'd like to get a running sum of val for each id, so the desired output looks like this:
  id   stuff  val  cumsum
0  A      12    1       1
1  B   23232    2       2
2  A      13   -3      -2
3  C    1234    1       1
4  D    3235    5       5
5  B    3236    6       8
6  C  732323   -2      -1
This is what I tried:
df['cumsum'] = df.groupby('id').cumsum(['val'])
This is the error I got:
ValueError: Wrong number of items passed 0, placement implies 1
You can call transform and pass the cumsum function to add that column to your df:
In [156]:
df['cumsum'] = df.groupby('id')['val'].transform(pd.Series.cumsum)
df

Out[156]:
  id   stuff  val  cumsum
0  A      12    1       1
1  B   23232    2       2
2  A      13   -3      -2
3  C    1234    1       1
4  D    3235    5       5
5  B    3236    6       8
6  C  732323   -2      -1
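With respect to your error: first, you called cumsum on the whole groupby object rather than on the val column, so the result is not a single Series that can be assigned to one new column. Second, you're passing the name of the column as a list to cumsum, which is meaningless; cumsum does not take a column selection.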
So this works:
In [159]:
df.groupby('id')['val'].cumsum()

Out[159]:
0    1
1    2
2   -2
3    1
4    5
5    8
6   -1
dtype: int64
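For reference, here is a minimal self-contained sketch that puts the pieces together, using the same data as the question. It assumes a reasonably recent pandas; the variable names direct and via_transform are only illustrative, and passing the string 'cumsum' to transform is an alternative spelling of the pd.Series.cumsum call shown earlier.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'],
    'val': [1, 2, -3, 1, 5, 6, -2],
    'stuff': ['12', '23232', '13', '1234', '3235', '3236', '732323'],
})

# Select the 'val' column first, then take the cumulative sum within each id group.
direct = df.groupby('id')['val'].cumsum()

# The same computation via transform, passing the method name as a string.
via_transform = df.groupby('id')['val'].transform('cumsum')

# Both results are aligned to the original index, so either can be assigned directly.
df['cumsum'] = direct
print(df)
```

Either form returns a Series with the same index as df, which is why it can be assigned as a new column without any reordering.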