I want to delete a specific number 'n' of rows from a dataframe, where the rows to delete are chosen randomly. The rows must also be selected based on a condition on the values of a particular column.
For example, my dataframe is as below:
C1 C2 C3
1 0 a
2 1 b
3 0 c
4 0 d
5 0 e
6 1 f
7 1 g
8 1 h
9 0 i
Now, I want to randomly remove n=2 rows, chosen only from the rows where C2==1.
The resulting frame could be as below:
C1 C2 C3
1 0 a
3 0 c
4 0 d
5 0 e
6 1 f
8 1 h
9 0 i
or
C1 C2 C3
1 0 a
2 1 b
3 0 c
4 0 d
5 0 e
7 1 g
9 0 i
or possibly some other combination. The question here does show how to remove 'n' sentences randomly, but it doesn't cover applying a condition.
Filter the rows with boolean indexing, pick random rows from the result with DataFrame.sample, and finally remove them with drop:
N = 2
df1 = df.drop(df[df['C2'].eq(1)].sample(N).index)
print(df1)
C1 C2 C3
0 1 0 a
1 2 1 b
2 3 0 c
3 4 0 d
4 5 0 e
6 7 1 g
8 9 0 i
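For a reproducible run, the same approach can be sketched end to end as below. It rebuilds the example frame from the question; the random_state value is an arbitrary seed added here for repeatability and is not part of the original answer.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({
    "C1": range(1, 10),
    "C2": [0, 1, 0, 0, 0, 1, 1, 1, 0],
    "C3": list("abcdefghi"),
})

N = 2
# Rows eligible for removal: those where C2 == 1
candidates = df[df["C2"].eq(1)]
# sample() draws without replacement by default;
# random_state=0 is an arbitrary seed (assumption) that fixes the draw
to_drop = candidates.sample(N, random_state=0).index
df1 = df.drop(to_drop)
print(df1)
```

Note that sample raises a ValueError when N is larger than the number of matching rows, so this sketch assumes at least N rows satisfy the condition.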
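Or use np.random.choice (with NumPy imported as np) to pick random index values. Pass replace=False so the same row cannot be picked twice; otherwise fewer than N rows may be dropped:
df1 = df.drop(np.random.choice(df.index[df['C2'].eq(1)], N, replace=False))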
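The NumPy variant can also be wrapped in a small helper, sketched below under a few assumptions: the function name drop_random_matching and its seed parameter are hypothetical, it uses NumPy's Generator API (np.random.default_rng) rather than the legacy np.random.choice, and it caps n at the number of matching rows instead of raising an error.

```python
import numpy as np
import pandas as pd

def drop_random_matching(df, mask, n, seed=None):
    """Drop up to n randomly chosen rows where mask is True (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Index labels of the rows that satisfy the condition
    candidates = df.index[mask].to_numpy()
    # Never ask for more rows than are available
    n = min(n, len(candidates))
    if n == 0:
        return df.copy()
    # replace=False guarantees n distinct labels
    chosen = rng.choice(candidates, size=n, replace=False)
    return df.drop(chosen)

# Example frame from the question
df = pd.DataFrame({
    "C1": range(1, 10),
    "C2": [0, 1, 0, 0, 0, 1, 1, 1, 0],
    "C3": list("abcdefghi"),
})

result = drop_random_matching(df, df["C2"].eq(1), 2, seed=0)
print(result)
```

Capping n is a design choice; if dropping fewer rows than requested should count as an error, raise an exception instead of taking the minimum.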