Iterate over rows to check if values of an array exist between two columns

Question

I have a dataframe and an array like this:

df
x y z
1 10 1
10 20 2
20 30 3
30 40 4
40 50 5

my_array = [5, 35, 36, 40, 41, 45, 46, 47, 48]

How could I iterate over the dataframe so that a row is kept only if at least one value of my_array lies between its x and y? The final df would be:

x y z
1 10 1
30 40 4
40 50 5

I have tried df=df[(my_array <= df['x']) and (df['y'] <= my_array)]

But it raises a ValueError: Lengths must match to compare.

The length of my_array is larger than the number of rows. Any help?

Shubham Sharma · Accepted Answer

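# Compare each row's x and y (as column vectors) against every value in my_array,
# then keep the rows where at least one value falls within [x, y]
df[((df['x'].values[:, None] <= my_array) &
    (df['y'].values[:, None] >= my_array)).any(axis=1)]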

    x   y  z
0   1  10  1
3  30  40  4
4  40  50  5
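
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same broadcasting idea. The names in_range, mask and result are only illustrative, and it assumes my_array is held as a NumPy array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [1, 10, 20, 30, 40],
    "y": [10, 20, 30, 40, 50],
    "z": [1, 2, 3, 4, 5],
})
my_array = np.array([5, 35, 36, 40, 41, 45, 46, 47, 48])

# Column vectors of shape (5, 1) broadcast against my_array of shape (9,)
x = df["x"].to_numpy()[:, None]
y = df["y"].to_numpy()[:, None]

# in_range[i, j] is True when x[i] <= my_array[j] <= y[i]
in_range = (x <= my_array) & (my_array <= y)  # shape (5, 9)

# Keep a row if at least one value falls inside its [x, y] interval
mask = in_range.any(axis=1)
result = df[mask]
print(result)
```

The original attempt fails because it compares a 5-element Series with a 9-element array element by element; adding the extra axis with [:, None] is what lets every row be compared against every value.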

Erfan · Answer

No need to iterate; we can use NumPy broadcasting (which can be memory-heavy for large datasets, since it builds an intermediate boolean array with one entry per row and value pair):

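import numpy as np

# Row indices of every (row, value) pair where x <= value <= y
idx = np.where(
    (df["x"].to_numpy()[:, None] <= my_array) &
    (df["y"].to_numpy()[:, None] >= my_array)
)[0]

# A row can match several values, so deduplicate before selecting
df.iloc[np.unique(idx)]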

    x   y  z
0   1  10  1
3  30  40  4
4  40  50  5
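
If the intermediate rows-by-values matrix is too large, one possible alternative (not from either answer, just a sketch) is to sort the array once and use np.searchsorted to count how many values fall inside each row's interval. It assumes the bounds are inclusive, as in the answers above, and that x <= y in every row; the names sorted_vals, lo, hi and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [1, 10, 20, 30, 40],
    "y": [10, 20, 30, 40, 50],
    "z": [1, 2, 3, 4, 5],
})
my_array = np.array([5, 35, 36, 40, 41, 45, 46, 47, 48])

sorted_vals = np.sort(my_array)

# lo[i]: position of the first value >= x[i]
lo = np.searchsorted(sorted_vals, df["x"].to_numpy(), side="left")
# hi[i]: position just past the last value <= y[i]
hi = np.searchsorted(sorted_vals, df["y"].to_numpy(), side="right")

# hi - lo is the number of values inside [x, y]; keep rows with at least one
result = df[hi > lo]
print(result)
```

This only needs two integer arrays with one entry per row, at the cost of sorting my_array first.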
