I have the following dataset:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(50,2), columns=list('AB'))
# plot data
plt.scatter(x=df.A, y=df.B)
x = plt.axhline(y=0.4,c='k')
y = plt.axvline(x=0.4,c='k')
plt.plot([0.2, 0.3], [0, 0.4], c='k')
I want to select the points in the green areas (see graph below). The points in the second quadrant were easy to select, but the points in the green area of the third quadrant were not.
This is how I selected the points in the second quadrant:
df[( df['A'] < 0.4) & (df['B'] > 0.4)]
After this I got stuck.
The conditions might get more complex, for example when the boundaries are curved lines. What is the best way to tackle this problem?
Open for any suggestions.
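I suggest using functools.reduce to combine the boolean conditions. The diagonal line through (0.2, 0) and (0.3, 0.4) is B = (A - 0.2) * 4, so the points to its left satisfy B > (A - 0.2) * 4: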
import numpy as np
import functools
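# below the horizontal line and left of A = 0.2
cr1 = functools.reduce(np.logical_and, [df.B < 0.4, df.A < 0.2])
# below the horizontal line, right of A = 0.2, but still left of the diagonal
cr2 = functools.reduce(np.logical_and, [df.B < 0.4, df.A > 0.2, df.B > (df.A-0.2)*4])
# keep a point if it falls in either region
df_filtered = df[functools.reduce(np.logical_or, [cr1,cr2])]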
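If the regions get more complicated, one option that may scale better is to describe each region as a polygon and test membership with matplotlib.path.Path.contains_points. The sketch below is only an illustration under assumptions: the polygon vertices are my reading of the third-quadrant area from the lines drawn in the question, the seed is arbitrary, and a curved boundary would have to be approximated by adding more vertices along the curve.

```python
import numpy as np
import pandas as pd
from matplotlib.path import Path

# Reproducible sample data in the same shape as the question's DataFrame
# (the seed is an arbitrary choice for this sketch).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((50, 2)), columns=list("AB"))

# Assumed polygon for the third-quadrant region left of the diagonal.
# Vertices are listed in order around the boundary; Path treats the
# polygon as closed even without repeating the first vertex.
region = Path([(0.0, 0.0), (0.2, 0.0), (0.3, 0.4), (0.0, 0.4)])

# contains_points expects an (N, 2) array and returns a boolean array.
inside = region.contains_points(df[["A", "B"]].to_numpy())
df_region = df[inside]

# Cross-check against the equivalent inequality written with & directly.
mask = (df.B < 0.4) & (df.B > (df.A - 0.2) * 4)
```

Points that lie exactly on the polygon boundary may be classified either way, so this approach suits data where such ties do not matter. For regions that are easy to write as inequalities, the plain mask on the last line is probably the simpler choice.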