I have a dataframe and an array like this:
df
x y z
1 10 1
10 20 2
20 30 3
30 40 4
40 50 5
my_array= 5 35 36 40 41 45 46 47 48
How can I filter the dataframe so that a row is kept only if at least one value of my_array lies between x and y? The final df would be:
x y z
1 10 1
30 40 4
40 50 5
I have tried df = df[(my_array <= df['x']) and (df['y'] <= my_array)], but it raises ValueError: Lengths must match to compare.
The length of my_array is larger than the number of rows. Any help?
df[((df['x'].values[:, None] <= my_array) &
    (df['y'].values[:, None] >= my_array)).any(axis=1)]
x y z
0 1 10 1
3 30 40 4
4 40 50 5
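For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same broadcasting idea. The DataFrame construction is reconstructed from the question, and the names `inside`, `mask` and `result` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Data reconstructed from the question.
df = pd.DataFrame({"x": [1, 10, 20, 30, 40],
                   "y": [10, 20, 30, 40, 50],
                   "z": [1, 2, 3, 4, 5]})
my_array = np.array([5, 35, 36, 40, 41, 45, 46, 47, 48])

# A (5, 1) column compared against a (9,) array broadcasts to a (5, 9) grid:
# inside[i, j] is True when x[i] <= my_array[j] <= y[i].
inside = ((df["x"].to_numpy()[:, None] <= my_array) &
          (df["y"].to_numpy()[:, None] >= my_array))

# Keep a row if at least one array value falls inside its interval.
mask = inside.any(axis=1)
result = df[mask]
print(result)
```

The original attempt fails because `my_array <= df['x']` tries to compare 9 values with 5 rows element by element; adding the extra axis with `[:, None]` is what lets every row be compared against every array value.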
No need to iterate; we can use NumPy broadcasting (which can be memory-heavy for large datasets):
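import numpy as np

# Row positions that have at least one match (repeated once per matching value).
idx = np.where(
    (df["x"].to_numpy()[:, None] <= my_array) &
    (df["y"].to_numpy()[:, None] >= my_array)
)[0]
df.iloc[np.unique(idx)]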
x y z
0 1 10 1
3 30 40 4
4 40 50 5
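If the rows-by-values boolean grid is too large to hold in memory, one possible alternative (not part of the answers above, just a sketch) is to sort the array once and use `np.searchsorted`. For each row it finds the smallest array value that is >= x and then checks whether that value is also <= y. The names `sorted_vals`, `pos`, `candidate` and `keep` are illustrative, and the sketch assumes `my_array` is non-empty.

```python
import numpy as np
import pandas as pd

# Data reconstructed from the question.
df = pd.DataFrame({"x": [1, 10, 20, 30, 40],
                   "y": [10, 20, 30, 40, 50],
                   "z": [1, 2, 3, 4, 5]})
my_array = np.array([5, 35, 36, 40, 41, 45, 46, 47, 48])

sorted_vals = np.sort(my_array)  # assumption: my_array is non-empty
x = df["x"].to_numpy()
y = df["y"].to_numpy()

# Index of the smallest array value that is >= x, for each row.
pos = np.searchsorted(sorted_vals, x, side="left")

# pos == len(sorted_vals) means every array value is < x, so no match.
has_candidate = pos < len(sorted_vals)

# Clip the index so the lookup stays in bounds; rows without a candidate
# are masked out by has_candidate anyway.
candidate = sorted_vals[np.minimum(pos, len(sorted_vals) - 1)]

# The row matches if that smallest candidate also satisfies candidate <= y.
keep = has_candidate & (candidate <= y)
result_sorted = df[keep]
print(result_sorted)
```

This only ever allocates arrays with one entry per row or per array value, at the cost of a sort and a binary search per row.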