import pandas as pd
training_data = pd.DataFrame()
training_data['a'] = [401,401.2,410,420,425,426, 426.1]
training_data['b'] = [1,1,2,2,2,3,3]
training_data['condition'] = [True, False, True, True, True,False, False]
My training data:
a b condition
401 1 True
401.2 1 False
410 2 True
420 2 True
425 2 True
426 3 False
426.1 3 False
Desired output:
a b condition
401 2 True (1+1)
410 2 True
420 2 True
425 8 True (2+3+3)
All rows where condition is False have been removed, and each removed row's 'b' value has been added to the 'b' of the nearest preceding True row.
How can I get this desired output?
I am aware of using .cumsum() with:
training_data.query('condition').groupby('grp').agg()
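You can do this with cumsum. The running sum of 'condition' increases by one at every True row, so each True row and the False rows that follow it share the same group key: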
out = training_data.groupby(training_data['condition'].cumsum()).agg({'a':'first','b':'sum','condition':'first'})
Out[271]:
a b condition
condition
1 401.0 2 True
2 410.0 2 True
3 420.0 2 True
4 425.0 8 True
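To make the grouping step easier to follow, here is a minimal sketch of the same idea split into two steps. The name grp is only an illustrative label (borrowed from the question), and the named-aggregation syntax assumes pandas 0.25 or later.

```python
import pandas as pd

training_data = pd.DataFrame({
    'a': [401, 401.2, 410, 420, 425, 426, 426.1],
    'b': [1, 1, 2, 2, 2, 3, 3],
    'condition': [True, False, True, True, True, False, False],
})

# Each True row bumps the running sum, so it starts a new group;
# the False rows after it keep the same group id.
# 'grp' is a hypothetical name chosen for illustration.
grp = training_data['condition'].cumsum().rename('grp')

out = (
    training_data
    .groupby(grp)
    .agg(
        a=('a', 'first'),                  # value from the True row
        b=('b', 'sum'),                    # True row plus trailing False rows
        condition=('condition', 'first'),  # True for every group here
    )
    .reset_index(drop=True)                # drop the group id from the index
)
print(out)
```

Note that if the data started with a False row, that row would land in group 0 with condition False. In that case you would probably want to filter it out afterwards, for example with out[out['condition']].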