I have a DataFrame:
import pandas as pd
A = pd.DataFrame([[1, 5, 2, 0], [2, 4, 4, 0], [3, 3, 1, 1], [4, 2, 2, 0], [5, 1, 4, 0], [2, 4, 4, 0], [3, 3, 1, 1], [4, 2, 2, 0], [5, 1, 4, 0]],
                 columns=['A', 'B', 'C', 'D'], index=[1, 2, 3, 4, 5, 6, 7, 8, 9])
I want to subset the DataFrame according to the following rule: select the rows where column 'D' equals 1, and also include the two rows directly above each of them (chunk size = 3).
Applying this rule to the example DataFrame, the output should be:
   A  B  C  D
1  1  5  2  0
2  2  4  4  0
3  3  3  1  1
5  5  1  4  0
6  2  4  4  0
7  3  3  1  1
Thanks
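This works with any chunk size. Mark each row where D is 1, subtract a copy of the marks shifted up by the chunk size, and take a running sum from the bottom up; the sum is positive exactly on the rows to keep:
>>> chunk = 3
>>> mask = (A['D'] == 1).astype(int)           # 1 on the last row of each chunk
>>> mask -= mask.shift(-chunk, fill_value=0)   # -1 on the row just above each chunk
>>> A[mask[::-1].cumsum()[::-1] > 0]           # bottom-up running sum, flipped back to A's order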
   A  B  C  D
1  1  5  2  0
2  2  4  4  0
3  3  3  1  1
5  5  1  4  0
6  2  4  4  0
7  3  3  1  1
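For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name select_chunks and its parameters flag_col and flag_value are hypothetical and not part of the answer above. It also assumes that "rows above" means positional order in the DataFrame, not index labels.

```python
import pandas as pd


def select_chunks(df, flag_col="D", flag_value=1, chunk=3):
    """Keep each row where df[flag_col] == flag_value plus the chunk - 1 rows above it.

    Hypothetical helper for illustration; rows are treated positionally.
    """
    # 1 on every flagged row, 0 elsewhere (integers, so subtraction is well defined)
    marks = (df[flag_col] == flag_value).astype(int)
    # Put a -1 on the row just above the start of each chunk
    marks = marks - marks.shift(-chunk, fill_value=0)
    # Running sum from the bottom up: positive exactly on rows inside some chunk
    keep = marks[::-1].cumsum()[::-1] > 0
    return df[keep]


A = pd.DataFrame(
    [[1, 5, 2, 0], [2, 4, 4, 0], [3, 3, 1, 1], [4, 2, 2, 0], [5, 1, 4, 0],
     [2, 4, 4, 0], [3, 3, 1, 1], [4, 2, 2, 0], [5, 1, 4, 0]],
    columns=["A", "B", "C", "D"],
    index=[1, 2, 3, 4, 5, 6, 7, 8, 9],
)

result = select_chunks(A, chunk=3)
print(result.index.tolist())  # [1, 2, 3, 5, 6, 7]
```

At each row the bottom-up sum equals the number of flagged rows among that row and the chunk - 1 rows below it, so overlapping chunks and chunks cut off at the top of the DataFrame should be handled without special cases.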