If I slice a dataframe with something like
>>> df = pd.DataFrame(data=[[x] for x in [1,2,3,5,1,3,2,1,1,4,5,6]], columns=['A'])
>>> df.loc[df['A'] == 1]
# or
>>> df[df['A'] == 1]
   A
0  1
4  1
7  1
8  1
how could I pad my selection by a buffer of 1 and get each of the indices 0, 1, 3, 4, 5, 6, 7, 8, 9? I want to select all rows for which the value in column 'A' is 1, plus the row immediately before and after any such row.
Edit: I'm hoping to figure out a solution that works for arbitrary pad sizes, rather than just for a pad size of 1.
Edit 2: Here's another example illustrating what I'm going for
df = pd.DataFrame(data=[[x] for x in [1,2,3,5,3,2,1,1,4,5,6,0,0,3,1,2,4,5]], columns=['A'])
and we're looking for pad == 2. In this case I'd be trying to fetch rows 0, 1, 2, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16.
You can use shift with the bitwise OR operator |:
c = df['A'] == 1
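# fill_value=False keeps the shifted masks boolean instead of introducing NaN at the edges
df[c | c.shift(fill_value=False) | c.shift(-1, fill_value=False)]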
   A
0  1
1  2
3  5
4  1
5  3
6  2
7  1
8  1
9  4
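The same shift idea can be widened to a larger buffer by OR-ing one shifted copy of the mask per offset. This is only a minimal sketch: `pad_by_shift` is a hypothetical helper name, not a pandas function, and it assumes the rows are already in the order in which "before" and "after" should be read.

```python
import pandas as pd


def pad_by_shift(cond: pd.Series, pad: int) -> pd.Series:
    """Hypothetical helper: True wherever cond is True within `pad` rows."""
    mask = cond.copy()
    for k in range(1, pad + 1):
        # Rows k positions after a match, then rows k positions before a match
        mask = mask | cond.shift(k, fill_value=False)
        mask = mask | cond.shift(-k, fill_value=False)
    return mask


df = pd.DataFrame({'A': [1, 2, 3, 5, 1, 3, 2, 1, 1, 4, 5, 6]})
padded = df[pad_by_shift(df['A'] == 1, pad=1)]
print(padded.index.tolist())
```

With pad=1 this selects the same rows as the one-liner above; the cost grows linearly with the pad size, since each offset adds two shifts.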
For arbitrary pad sizes, you may try where, interpolate, and notna to create the mask:
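n = 2
# Keep the 1s and turn everything else into NaN (a Series, so the mask below filters rows)
c = df['A'].where(df['A'] == 1)
# Fill at most n NaNs on either side of each kept value, then flag whatever is no longer NaN
m = c.interpolate(limit=n, limit_direction='both').notna()
df[m]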
Out[61]:
    A
0   1
1   2
2   3
4   3
5   2
6   1
7   1
8   4
9   5
12  0
13  3
14  1
15  2
16  4
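If the interpolate trick feels indirect, a centered rolling maximum over the boolean condition should produce the same mask. This is a sketch under assumptions: `pad_by_rolling` is a hypothetical helper name, and "neighbouring" is taken to mean positional neighbours in the current row order.

```python
import pandas as pd


def pad_by_rolling(cond: pd.Series, pad: int) -> pd.Series:
    """Hypothetical helper: True wherever cond is True within `pad` rows."""
    window = 2 * pad + 1  # the row itself plus `pad` rows on each side
    return (
        cond.astype(int)
        # min_periods=1 lets the truncated windows at both ends still produce a value
        .rolling(window, center=True, min_periods=1)
        .max()
        .astype(bool)
    )


df2 = pd.DataFrame({'A': [1, 2, 3, 5, 3, 2, 1, 1, 4, 5, 6, 0, 0, 3, 1, 2, 4, 5]})
selected = df2[pad_by_rolling(df2['A'] == 1, pad=2)]
print(selected.index.tolist())
```

The window covers the current row and pad rows on either side, so a row is kept whenever any row in that window matches the condition.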