I have the following example data and I'd like to filter it: whenever a row has col1 = 'A' and col2 = 0, I want to keep that row and all following rows until the next row with col1 = 'A'.
I want to do this with a pandas DataFrame, but I don't know how.
col1 takes the values 'A', 'B' or 'C'; col2 is 0 or 1 and is only filled in on the 'A' rows.
For example, we have this data:
col1 col2
A 0
C
A 1
B
C
A 1
B
B
C
A 0
B
C
A 1
B
C
C
The result I want to achieve is:
col1 col2
A 0
C
A 0
B
C
Thank you very much
We first group by row blocks starting with 'A' and then propagate the first value of col2 to all rows of the group. From this result we take all rows with 0 in col2.
df[df.groupby(df.col1.eq('A').cumsum()).col2.transform('first').eq(0)]
Sample data:
import pandas as pd

df = pd.DataFrame({'col1': list('ACABCABBCABCABCC'),
                   'col2': [0, None, 1, None, None, 1, None, None, None, 0, None, None, 1, None, None, None]}
                  ).astype({'col2': 'Int32'})
Result:
col1 col2
0 A 0
1 C <NA>
9 A 0
10 B <NA>
11 C <NA>
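To see why the one-liner works, it can help to unpack it into separate steps. The following is only a sketch using the same sample data; the intermediate names block_id, block_flag and result are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Same sample data as above: col2 is only set on the 'A' rows.
df = pd.DataFrame({
    'col1': list('ACABCABBCABCABCC'),
    'col2': [0, None, 1, None, None, 1, None, None, None,
             0, None, None, 1, None, None, None],
}).astype({'col2': 'Int32'})

# Step 1: build a block id that increases by one at every 'A' row,
# so each 'A' and the rows after it share the same id.
block_id = df['col1'].eq('A').cumsum()

# Step 2: broadcast the first non-null col2 value of each block
# to every row of that block.
block_flag = df.groupby(block_id)['col2'].transform('first')

# Step 3: keep only the rows whose block started with col2 == 0.
result = df[block_flag.eq(0)]
print(result)
```

Note that 'first' takes the first non-null value in each block, so this relies on every 'A' row actually having a value in col2, as it does in the sample data.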