Python PANDAS: Drop All Rows After First Occurrence of Column Value

Tags:

python

pandas

I have a pandas DataFrame with one column holding an open/closed status value and another holding a rank value. After I sort by the rank column, what would be the best way to drop all rows after the first occurrence of an "Open" value? I'm unsure whether I should use an iterator function or a standard index-based approach in pandas. Any advice would be great!

Edit: This is what I have started with so far:

df[["Rank", "Status"]].sort_values(by="Rank")

The output I am trying to accomplish would look like the following:

From this:

Rank Status
1    Closed
5    Closed
6    Open
9    Closed
10   Open

To this:

Rank Status
 1    Closed
 5    Closed
 6    Open

Pylander asked Dec 08 '15 18:12

2 Answers

You can reset the index when you sort the DataFrame, then find the index label of the first 'Open' row and slice the data up to and including that row:

import pandas as pd

# create the DataFrame
df = pd.DataFrame({
    'Rank': [5, 1, 10, 6, 9],
    'Status': ['Closed', 'Closed', 'Open', 'Closed', 'Open']
})

# sort by rank and reset the index (drop=True discards the old index labels)
df = df.sort_values('Rank').reset_index(drop=True)

# slice through the first occurrence of 'Open' (.loc slices include the end label)
df.loc[:df[df['Status'] == 'Open'].index[0], :]
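
The slice above raises an IndexError if the column contains no 'Open' value at all. As a rough sketch of one way around that (not part of the original answer), a boolean mask can keep every row that has no earlier 'Open' row. The names df_sorted, is_open, opens_before and result are only illustrative, and the fill_value argument of shift assumes a reasonably recent pandas version.

```python
import pandas as pd

df = pd.DataFrame({
    "Rank": [5, 1, 10, 6, 9],
    "Status": ["Closed", "Closed", "Open", "Closed", "Open"],
})

# Sort by rank; the mask below does not depend on the index labels,
# resetting them just keeps the output tidy.
df_sorted = df.sort_values("Rank").reset_index(drop=True)

# True where the row is "Open".
is_open = df_sorted["Status"].eq("Open")

# Count of "Open" rows strictly before each row (0 for the first row).
opens_before = is_open.cumsum().shift(fill_value=0)

# Keep rows with no earlier "Open"; this includes the first "Open" row itself.
result = df_sorted[opens_before.eq(0)]
print(result)
```

With this sample data the sorted ranks are 1, 5, 6, 9, 10, and the first 'Open' sits at rank 9, so the result keeps ranks 1, 5, 6 and 9. If there is no 'Open' row, the mask is True everywhere and nothing is dropped.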

Woody Pride answered Sep 18 '22 12:09

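Almost the same answer, manipulating df directly. This assumes df is already sorted by Rank and has a reset, 0-based index:

df = df[:df[df['Status'] == 'Open'].index[0] + 1]

This finds the index of the first instance of the value and then slices the DataFrame up to and including that row. The + 1 is needed because a plain df[:n] slice is positional and excludes row n, which would otherwise drop the 'Open' row itself.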
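
If the index has not been reset, index labels and row positions no longer line up, so a purely positional version may be safer. The following is a minimal sketch under that assumption. The helper name keep_through_first is hypothetical, and returning the frame unchanged when nothing matches is an assumed behaviour, not something the question specifies. The sample data is the table from the question.

```python
import pandas as pd


def keep_through_first(df, column, value):
    """Return rows up to and including the first row where df[column] == value."""
    matches = (df[column] == value).to_numpy()
    if not matches.any():
        # Assumption: with no match there is nothing to drop.
        return df
    # argmax on a boolean array gives the position of the first True.
    first_pos = int(matches.argmax())
    # iloc is positional and end-exclusive, hence the + 1.
    return df.iloc[: first_pos + 1]


df = pd.DataFrame({
    "Rank": [1, 5, 6, 9, 10],
    "Status": ["Closed", "Closed", "Open", "Closed", "Open"],
})

trimmed = keep_through_first(df.sort_values("Rank"), "Status", "Open")
print(trimmed)  # rows with Rank 1, 5 and 6
```

Because it relies on iloc rather than index labels, this works whether or not reset_index was called after sorting.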

sparrow answered Sep 19 '22 12:09