Conditionally drop Pandas Dataframe row

Question

I want to drop every row whose neighbours (the row just before and the row just after) have the same value in the column num2. My dataframe looks like this:

import pandas as pd

df = pd.DataFrame([
    [12, 10],
    [11, 10],
    [13, 10],
    [42, 11],
    [4, 11],
    [5, 2]
], columns=["num1", "num2"]
)

And this is my target:

df = pd.DataFrame([
    [12, 10],
    [13, 10],
    [42, 11],
    [4, 11],
    [5, 2]
], columns=["num1", "num2"]
)

What I have tried:

df["num1_diff"] = df["num2"].diff().fillna(0).astype(int)
filt = df["num1_diff"].apply(lambda x: x == 0)
print(df[filt])

Giving:

   num1  num2  num1_diff
0    12    10          0
1    11    10          0
2    13    10          0
4     4    11          0

I was thinking of using the new num1_diff column to do the filtering. Is this a good approach, or is there a better one?

Erfan · Accepted Answer

Use Series.shift twice, once in each direction, and keep the rows where the previous and next values of num2 are not equal:

df[df['num2'].shift().ne(df['num2'].shift(-1))]

   num1  num2
0    12    10
2    13    10
3    42    11
4     4    11
5     5     2
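
To see why this works, the one-liner can be split into steps. The following is only a sketch of the same idea; the names prev_val, next_val, keep and result are introduced here for illustration and are not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame(
    [[12, 10], [11, 10], [13, 10], [42, 11], [4, 11], [5, 2]],
    columns=["num1", "num2"],
)

# Value of num2 in the row above (NaN for the first row)
prev_val = df["num2"].shift()
# Value of num2 in the row below (NaN for the last row)
next_val = df["num2"].shift(-1)

# Keep a row unless its two neighbours hold the same value.
# NaN never compares equal to anything, so the first and last
# rows are always kept.
keep = prev_val.ne(next_val)

result = df[keep]
print(result)
```

Note that the mask only compares the two neighbours with each other, not with the row itself. With num2 values such as 10, 11, 10 the middle row would also be dropped; if that is not wanted, an extra condition such as prev_val.eq(df["num2"]) could be combined into the mask.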

Tags:

python

python-3.x

pandas

dataframe

Gustav Rasmussen
