Removing specific rows from a dataframe

I have a data frame e.g.:

sub   day
1      1
1      2
1      3
1      4
2      1
2      2
2      3
2      4
3      1
3      2
3      3
3      4

and I would like to remove specific rows that can be identified by the combination of sub and day. For example, say I wanted to remove the rows where sub='1' and day='2', and where sub='3' and day='4'. How could I do this? I realise that I could specify the row numbers, but this needs to be applied to a huge data frame, and going through it to identify each row would be tedious.

ThallyHo asked Aug 18 '11 19:08


1 Answer

DF[ ! ( ( DF$sub ==1 & DF$day==2) | ( DF$sub ==3 & DF$day==4) ) , ]   # note the ! (negation) 

Or if sub is a factor as suggested by your use of quotes:

DF[ ! paste(DF$sub, DF$day, sep="_") %in% c("1_2", "3_4"), ]   # inside "[" the columns must be referenced through DF$

Could also use subset:

subset(DF,  ! paste(sub,day,sep="_") %in% c("1_2", "3_4") ) 
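To check that the three approaches agree, here is a minimal sketch that rebuilds the example data from the question (assuming sub and day are plain integers; the names keep1, keep2, keep3 and key are only illustrative). Note that inside "[" the columns are referenced as DF$sub and DF$day, whereas subset() looks up sub and day in DF itself.

```r
# Rebuild the example data from the question: 3 subjects x 4 days.
DF <- data.frame(sub = rep(1:3, each = 4), day = rep(1:4, times = 3))

# Approach 1: negate the OR of the two (sub, day) conditions.
keep1 <- DF[!((DF$sub == 1 & DF$day == 2) | (DF$sub == 3 & DF$day == 4)), ]

# Approach 2: build a "sub_day" key and drop the rows whose key is listed.
key <- paste(DF$sub, DF$day, sep = "_")
keep2 <- DF[!key %in% c("1_2", "3_4"), ]

# Approach 3: subset() evaluates sub and day inside DF.
keep3 <- subset(DF, !paste(sub, day, sep = "_") %in% c("1_2", "3_4"))

nrow(keep1)  # 12 rows minus the 2 dropped rows
```

The key-based form scales more comfortably when there are many pairs to drop, since the list of pairs lives in one character vector instead of a growing chain of | conditions.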

(And I endorse the use of which in Dirk's answer when using "[" even though some claim it is not needed.)
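One reason which is sometimes preferred with "[" is how missing values are handled. The sketch below uses a small made-up data frame with an NA in sub (not the data from the question) to show the difference: a logical index containing NA produces an all-NA row, while which() silently drops positions that are not TRUE.

```r
# Hypothetical data with a missing sub value in the third row.
DF <- data.frame(sub = c(1, 1, NA, 3), day = c(1, 2, 2, 4))

# Rows we want to remove; the comparison yields NA for the third row.
drop <- (DF$sub == 1 & DF$day == 2) | (DF$sub == 3 & DF$day == 4)

# Plain logical indexing: the NA in the index becomes an all-NA row.
plain <- DF[!drop, ]

# which() keeps only positions where the condition is TRUE.
with_which <- DF[which(!drop), ]

nrow(plain)       # includes the all-NA row
nrow(with_which)  # only the row that definitely does not match
```

Whether a row with a missing sub should be kept or discarded is a judgment call for the data at hand; subset() behaves like the which() version here, since it also treats NA as "do not keep".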

IRTFM answered Oct 02 '22 15:10