I want to drop rows where the rows just before and just after have the same value in the column num2.
My dataframe looks like this:
import pandas as pd
df = pd.DataFrame([
[12, 10],
[11, 10],
[13, 10],
[42, 11],
[4, 11],
[5, 2]
], columns=["num1", "num2"]
)
And this is my target:
df = pd.DataFrame([
[12, 10],
[13, 10],
[42, 11],
[4, 11],
[5, 2]
], columns=["num1", "num2"]
)
What I have tried:
df["num1_diff"] = df["num2"].diff().fillna(0).astype(int)
filt = df["num1_diff"].apply(lambda x: x == 0)
print(df[filt])
Giving:
num1 num2 num1_diff
0 12 10 0
1 11 10 0
2 13 10 0
4 4 11 0
I was thinking of using the new num1_diff column to do the filtering.
Is this a good approach, or is there perhaps a better one?
Use Series.shift twice to get the previous and the next value of num2, and keep only the rows where those two values differ:
df[df['num2'].shift().ne(df['num2'].shift(-1))]
num1 num2
0 12 10
2 13 10
3 42 11
4 4 11
5 5 2
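For readers who want to see the intermediate steps, here is the same one-liner spelled out as a small self-contained sketch. The names prev_val, next_val, keep and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [[12, 10], [11, 10], [13, 10], [42, 11], [4, 11], [5, 2]],
    columns=["num1", "num2"],
)

# Value of num2 in the row above (NaN for the first row).
prev_val = df["num2"].shift()
# Value of num2 in the row below (NaN for the last row).
next_val = df["num2"].shift(-1)

# Keep a row unless its two neighbours hold the same num2 value.
keep = prev_val.ne(next_val)
result = df[keep]
print(result)
```

Note that the first and last rows are always kept, because one of their neighbours is missing and NaN never compares equal to anything. If the original 0..n-1 index is not needed afterwards, result.reset_index(drop=True) renumbers the remaining rows.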