python combine rows in dataframe and add up values

Question

I have a dataframe:

 Type:  Volume:
 Q     10
 Q     20 
 T     10 
 Q     10
 T     20
 T     20
 Q     10

I want to combine rows of type T into a single row and add up their volumes, but only where two (or more) T rows are consecutive,

i.e. turn it into:

 Q    10
 Q    20 
 T    10 
 Q    10 
 T    20+20=40
 Q    10

Is there any way to achieve this? Would DataFrame.groupby work?

a.deshpande012 · Accepted Answer

I think this will help. This code can handle any number of consecutive 'T's, and you can even change which character to combine. I've added comments in the code to explain what it does.

https://pastebin.com/FakbnaCj

import pandas as pd

def combine(df, value='T'):
    combined = [] # Rows of the result
    length = len(df) # Number of rows in the DataFrame
    i = 0
    while i < length:
        num_elements = num_elements_equal(df, i, 0, value) # Number of consecutive `value` rows starting at row i
        if num_elements <= 1: # Zero or one matching row: copy row i unchanged
            combined.append([df.iloc[i,0],df.iloc[i,1]])
        else: # Two or more: append a single row holding the sum of the whole run
            combined.append([value, sum_elements(df, i, i+num_elements, 1)])
        i += max(num_elements, 1) # Skip past the rows just handled, moving forward by at least 1
    return pd.DataFrame(combined, columns=df.columns) # Return as DataFrame

def num_elements_equal(df, start, column, value): # Counts the number of consecutive elements
    i = start
    num = 0
    while i < len(df.iloc[:,column]):
        if df.iloc[i,column] == value:
            num += 1
            i += 1
        else:
            return num
    return num

def sum_elements(df, start, end, column): # Sums the elements from start to end
    return sum(df.iloc[start:end, column])

frame = pd.DataFrame({"Type":   ["Q", "Q", "T", "Q", "T", "T", "Q"],
                      "Volume": [10,   20,  10,  10,  20,  20,  10]})
print(combine(frame))
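
For comparison, the same run-merging idea can be sketched without pandas, using itertools.groupby on plain (type, volume) tuples. This is only an illustration: merge_runs, its value parameter, and the rows list are names made up for this sketch and are not part of the answer above.

```python
from itertools import groupby

def merge_runs(rows, value="T"):
    """Merge consecutive rows whose type equals `value`, summing their volumes.

    `rows` is a list of (type, volume) tuples; the name and signature are
    hypothetical, chosen only for this sketch.
    """
    merged = []
    # itertools.groupby yields one group per run of equal consecutive keys.
    for kind, run in groupby(rows, key=lambda row: row[0]):
        run = list(run)
        if kind == value:
            # A whole run of `value` rows collapses into a single row.
            merged.append((kind, sum(volume for _, volume in run)))
        else:
            # Rows of any other type are kept one by one.
            merged.extend(run)
    return merged

rows = [("Q", 10), ("Q", 20), ("T", 10), ("Q", 10), ("T", 20), ("T", 20), ("Q", 10)]
print(merge_runs(rows))
```

Passing a different value (for example value="Q") merges runs of that type instead.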

jdehesa · Answer

If you just need the partial sums, here is a little trick to do that:

import numpy as np
import pandas as pd

df = pd.DataFrame({"Type":   ["Q", "Q", "T", "Q", "T", "T", "Q"],
                   "Volume": [10,   20,  10,  10,  20,  20,  10]})
s = np.diff(np.r_[0, df.Type == "T"])
s[s < 0] = 0
res = df.groupby(["Type", np.cumsum(s) - 1]).sum().loc["T"]  # keys must be a list; recent pandas treats a tuple as one single key
print(res)

Output:

   Volume
0      10
1      40
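
If the whole frame from the question is wanted rather than just the partial sums, a similar grouping idea can be sketched as below. This is a minimal sketch, not part of the answer above: it assumes a pandas version with named aggregation (0.25 or later), and is_t, starts_group, group_id and full are names introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Type":   ["Q", "Q", "T", "Q", "T", "T", "Q"],
                   "Volume": [10,   20,  10,  10,  20,  20,  10]})

is_t = df["Type"].eq("T")
# Every row starts a new group, except a "T" row that directly follows another "T".
starts_group = ~(is_t & is_t.shift(fill_value=False))
# Running count of group starts gives one id per output row.
group_id = starts_group.cumsum().rename("run")

full = (df.groupby(group_id, sort=False)
          .agg(Type=("Type", "first"), Volume=("Volume", "sum"))
          .reset_index(drop=True))
print(full)
```

Each non-T row ends up in a group of its own, so only the consecutive T rows are actually summed.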
