If I have a pandas DataFrame like this, made up of 0s and 1s:
1 1 1 0 0 0 0 1 0
1 1 1 1 1 0 0 0 0
1 1 1 0 0 0 0 1 0
1 0 0 0 0 1 0 0 0
How do I filter out outliers such that I get something like this:
1 1 1 0 0 0 0 0 0
1 1 1 1 1 0 0 0 0
1 1 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0
In other words, any 1 that appears after the first 0 in a row should become 0.
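We can do this with a cumulative product over the second axis (axis=1, that is, along each row) using DataFrame.cumprod [pandas-doc]. Once a row contains a 0, every later product in that row is 0 as well: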
>>> df.cumprod(axis=1)
   0  1  2  3  4  5  6  7  8
0  1  1  1  0  0  0  0  0  0
1  1  1  1  1  1  0  0  0  0
2  1  1  1  0  0  0  0  0  0
3  1  0  0  0  0  0  0  0  0
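For a reproducible version, here is a minimal sketch that rebuilds the frame from the question and compares the result with the desired output. The names `result` and `expected` are only assumptions for illustration; the question does not show how `df` was created.

```python
import pandas as pd

# Sample frame from the question (only 0s and 1s).
df = pd.DataFrame([
    [1, 1, 1, 0, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0, 0, 0],
])

# Running product along each row: after the first 0, every later value is 0.
result = df.cumprod(axis=1)

# Desired output as given in the question.
expected = pd.DataFrame([
    [1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
])

# Compare the two frames value by value.
matches = result.values.tolist() == expected.values.tolist()
print(matches)
```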
Because the frame contains only 0s and 1s, the same result can be obtained here with the cumulative minimum, DataFrame.cummin [pandas-doc]:
>>> df.cummin(axis=1)
   0  1  2  3  4  5  6  7  8
0  1  1  1  0  0  0  0  0  0
1  1  1  1  1  1  0  0  0  0
2  1  1  1  0  0  0  0  0  0
3  1  0  0  0  0  0  0  0  0
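The two methods agree only because every value is 0 or 1. As a rough sketch, the frames `binary` and `other` below are hypothetical data (not from the question) that show where the results coincide and where they would diverge:

```python
import pandas as pd

# Hypothetical 0/1 row: running product and running minimum are identical.
binary = pd.DataFrame([[1, 1, 0, 1, 0]])
same = binary.cumprod(axis=1).equals(binary.cummin(axis=1))

# Hypothetical non-binary row: the two methods no longer agree.
other = pd.DataFrame([[2, 3, 0, 5]])
prod_row = other.cumprod(axis=1).iloc[0].tolist()  # running product per column
min_row = other.cummin(axis=1).iloc[0].tolist()    # running minimum per column

print(same)
print(prod_row)
print(min_row)
```

If the data might ever contain values other than 0 and 1, it is worth picking whichever of the two matches the intended meaning rather than treating them as interchangeable.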