
Pandas delete first n rows until condition on columns is fulfilled

I am trying to delete some rows from my dataframe. Specifically, I want to delete the first n rows, where n is the row number at which a certain condition is first met. I want the dataframe to start with the row that contains the x-y values xEnd, yEnd. All earlier rows should be dropped from the dataframe. I cannot work out the solution. This is what I have so far.

Example:

import pandas as pd
xEnd=2
yEnd=3
df = pd.DataFrame({'x':[1,1,1,2,2,2], 'y':[1,2,3,3,4,3], 'id':[0,1,2,3,4,5]})
n=df["id"].iloc[df["x"]==xEnd and df["y"]==yEnd]
df = df.iloc[n:]

I want my code to reduce the dataframe from

{'x':[1,1,1,2,2,2], 'y':[1,2,3,3,4,3], 'id':[0,1,2,3,4,5]}

to

{'x':[2,2,2], 'y':[3,4,3], 'id':[3,4,5]}
Mauritius asked Oct 20 '18 15:10



2 Answers

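  • Use & instead of and. The and operator cannot combine two boolean Series element-wise
  • Use loc instead of iloc. You can use iloc, but it breaks when the index labels do not match the row positions
  • Use idxmax to find the index label of the first row where the condition is True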

#             I used idxmax to find the index |
#                                             v
df.loc[((df['x'] == xEnd) & (df['y'] == yEnd)).idxmax():]
# ^
# | finding the index goes with using loc

   id  x  y
3   3  2  3
4   4  2  4
5   5  2  3
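
To illustrate why the label returned by idxmax belongs with loc, here is a minimal sketch that reuses the question's data but gives it a non-default index. The index values 10 to 15 are an assumption made up for this example; they are not part of the question.

```python
import pandas as pd

xEnd, yEnd = 2, 3

# Same data as in the question, but with a hypothetical non-default index
df = pd.DataFrame(
    {'x': [1, 1, 1, 2, 2, 2], 'y': [1, 2, 3, 3, 4, 3], 'id': [0, 1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14, 15],
)

mask = (df['x'] == xEnd) & (df['y'] == yEnd)
first_label = mask.idxmax()          # index label of the first True, here 13

by_label = df.loc[first_label:]      # label-based slice: keeps labels 13, 14, 15
by_position = df.iloc[first_label:]  # position 13 is past the end: empty result
```

With the default RangeIndex the labels and positions coincide, which is why mixing idxmax with iloc appears to work on the question's original frame.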

Here is an iloc variation

#    I used values.argmax to find the position |
#                                              v
df.iloc[((df['x'] == xEnd) & (df['y'] == yEnd)).values.argmax():]
# ^
# | finding the position goes with using iloc

   id  x  y
3   3  2  3
4   4  2  4
5   5  2  3
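
One caveat applies to both variants: if no row satisfies the condition, idxmax and argmax both point at the first row, so the whole dataframe is kept. The sketch below guards against that case. The helper name drop_until_first_match is hypothetical, and returning an empty frame when nothing matches is an assumption about the desired behaviour.

```python
import pandas as pd


def drop_until_first_match(df, mask):
    """Return df starting at the first row where mask is True (hypothetical helper)."""
    if not mask.any():
        # No row matches: argmax would return 0 and keep every row,
        # so return an empty frame instead (assumed desired behaviour).
        return df.iloc[0:0]
    # Position of the first True goes with iloc
    return df.iloc[mask.values.argmax():]


df = pd.DataFrame({'x': [1, 1, 1, 2, 2, 2], 'y': [1, 2, 3, 3, 4, 3], 'id': [0, 1, 2, 3, 4, 5]})

found = drop_until_first_match(df, (df['x'] == 2) & (df['y'] == 3))    # rows with id 3, 4, 5
missing = drop_until_first_match(df, (df['x'] == 9) & (df['y'] == 9))  # empty frame
```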
piRSquared answered Oct 16 '22 09:10



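Using cummax on the boolean mask: the cumulative maximum stays True from the first match onward, so it can be used directly for boolean indexing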

df[((df['x'] == xEnd) & (df['y'] == yEnd)).cummax()]
Out[147]: 
   id  x  y
3   3  2  3
4   4  2  4
5   5  2  3
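
To make the intermediate steps of the cummax approach visible, here is a small sketch on the question's data. The variable names mask, keep and result are only for illustration.

```python
import pandas as pd

xEnd, yEnd = 2, 3
df = pd.DataFrame({'x': [1, 1, 1, 2, 2, 2], 'y': [1, 2, 3, 3, 4, 3], 'id': [0, 1, 2, 3, 4, 5]})

# True only on the rows that match the condition
mask = (df['x'] == xEnd) & (df['y'] == yEnd)
# mask: False, False, False, True, False, True

# The cumulative maximum switches to True at the first match and stays True
keep = mask.cummax()
# keep: False, False, False, True, True, True

result = df[keep]  # rows with id 3, 4, 5
```

If no row matches, keep is False everywhere and the result is an empty dataframe.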
BENY answered Oct 16 '22 11:10
