 

Pad selection range in Pandas Dataframe?

If I slice a dataframe with something like

>>> df = pd.DataFrame(data=[[x] for x in [1,2,3,5,1,3,2,1,1,4,5,6]], columns=['A'])

>>> df.loc[df['A'] == 1]
# or
>>> df[df['A'] == 1]

   A
0  1
4  1
7  1
8  1

how could I pad my selection by a buffer of 1 and get each of the indices 0, 1, 3, 4, 5, 6, 7, 8, 9? I want to select all rows for which the value in column 'A' is 1, plus the row immediately before and the row immediately after any such row.


Edit: I'm hoping to find a solution that works for arbitrary pad sizes, rather than just for a pad size of 1.


Edit 2: Here's another example illustrating what I'm going for:

df = pd.DataFrame(data=[[x] for x in [1,2,3,5,3,2,1,1,4,5,6,0,0,3,1,2,4,5]], columns=['A'])

With pad == 2, I'd be trying to fetch rows 0, 1, 2, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16.

asked Dec 17 '22 11:12 by RagingRoosevelt


2 Answers

You can use shift with bitwise or (|): shift the boolean mask one row down and one row up, then combine the three masks.

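c = df['A'] == 1
# fill_value=False keeps the shifted masks boolean instead of introducing NaN at the edges
df[c | c.shift(fill_value=False) | c.shift(-1, fill_value=False)]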

   A
0  1
1  2
3  5
4  1
5  3
6  2
7  1
8  1
9  4
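
The same shift idea can probably be stretched to an arbitrary pad by OR-ing the mask with itself shifted by 1, 2, ..., pad rows in both directions. The following is only a sketch under that assumption; the helper name `pad_mask_shift` is hypothetical and not part of pandas.

```python
import pandas as pd


def pad_mask_shift(mask: pd.Series, pad: int) -> pd.Series:
    """Hypothetical helper: widen a boolean mask by `pad` rows on each side."""
    out = mask.copy()
    for k in range(1, pad + 1):
        # fill_value=False keeps the shifted masks boolean at the edges
        out = out | mask.shift(k, fill_value=False) | mask.shift(-k, fill_value=False)
    return out


df = pd.DataFrame({'A': [1, 2, 3, 5, 1, 3, 2, 1, 1, 4, 5, 6]})
padded = df[pad_mask_shift(df['A'] == 1, pad=1)]
print(padded)
```

This does 2 * pad shifts, so it is fine for small pads but may be less convenient than a windowed approach for large ones.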
answered Jan 01 '23 11:01 by anky


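For arbitrary pad sizes, you can build the mask with where, interpolate, and notna: where turns every value other than 1 into NaN, interpolate with limit=n and limit_direction='both' fills at most n NaNs on each side of a 1, and notna marks the rows that were kept or filled.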

n = 2  # pad size
# work on the column so that the mask is a Series and df[m] selects rows
c = df['A'].where(df['A'] == 1)
m = c.interpolate(limit=n, limit_direction='both').notna()
df[m]

Out[61]:
    A
0   1
1   2
2   3
4   3
5   2
6   1
7   1
8   4
9   5
12  0
13  3
14  1
15  2
16  4
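
The interpolate trick relies on column 'A' being numeric. As an alternative sketch (a swapped-in technique, not the approach from the answer above), a centered rolling maximum over the boolean mask should give the same rows while only touching the mask itself. It assumes the default integer index with rows in their original order.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 5, 3, 2, 1, 1, 4, 5, 6, 0, 0, 3, 1, 2, 4, 5]})
n = 2  # pad size

# 1 where the condition holds, 0 elsewhere
hit = (df['A'] == 1).astype(int)

# A centered window of 2*n + 1 rows spans n rows on each side of a row;
# min_periods=1 lets the truncated windows at both ends still produce a value.
m = hit.rolling(2 * n + 1, center=True, min_periods=1).max().astype(bool)

result = df[m]
print(result)
```

Changing n changes the pad size without any other modification, since the window width is derived from it.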
answered Jan 01 '23 12:01 by Andy L.