I have a data frame e.g.:
sub day
1   1
1   2
1   3
1   4
2   1
2   2
2   3
2   4
3   1
3   2
3   3
3   4
and I would like to remove specific rows that can be identified by the combination of sub and day. For example, say I wanted to remove the rows where sub='1' and day='2', and where sub='3' and day='4'. How could I do this? I realise that I could specify the row numbers, but this needs to be applied to a huge data frame, and identifying each row by hand would be tedious.
To delete rows and columns from DataFrames, pandas uses the “drop” function. To delete one or more columns, pass the column name(s) and set “axis” to 1. Alternatively, the 'columns' parameter removes the need for 'axis'.
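As a minimal sketch of the two column-dropping spellings described above (the frame `df` and its values are illustrative, rebuilt from the question's example), something like this should work:

```python
import pandas as pd

# Illustrative frame with the same columns as the question's example.
df = pd.DataFrame({
    "sub": [1, 1, 2, 2, 3, 3],
    "day": [1, 2, 1, 2, 1, 2],
})

# Drop a column by name, telling drop() to look along the column axis.
by_axis = df.drop("day", axis=1)

# Same result using the columns= keyword, so axis is not needed.
by_name = df.drop(columns="day")
```

Both calls return a new DataFrame and leave `df` itself unchanged.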
Use the pandas.DataFrame.drop() method to remove rows that match one or more conditions.
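A hedged sketch of how that might look for the sub/day example from the question: `drop()` takes index labels rather than a condition, so one approach is to build a boolean mask first and pass the labels of the matching rows. The names `df`, `to_remove`, `dropped` and `kept` are illustrative.

```python
import pandas as pd

# Sample data mirroring the question: 3 subjects, 4 days each.
df = pd.DataFrame({
    "sub": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "day": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
})

# Boolean mask that is True for the rows to remove.
to_remove = ((df["sub"] == 1) & (df["day"] == 2)) | (
    (df["sub"] == 3) & (df["day"] == 4)
)

# drop() expects index labels, so pass the labels of the matching rows.
dropped = df.drop(df[to_remove].index)

# Equivalent without drop(): keep every row that does not match.
kept = df[~to_remove]
```

Negating the mask avoids the intermediate index lookup and gives the same rows; it is the pandas counterpart of negating a logical index in R.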
In R, build a logical index that matches the rows to remove and negate it:
DF[ !( (DF$sub == 1 & DF$day == 2) | (DF$sub == 3 & DF$day == 4) ), ]  # note the ! (negation)
Or if sub is a factor as suggested by your use of quotes:
DF[ ! paste(DF$sub, DF$day, sep="_") %in% c("1_2", "3_4"), ]  # columns must be qualified with DF$ inside "["
You could also use subset, which evaluates sub and day inside DF:
subset(DF, ! paste(sub, day, sep="_") %in% c("1_2", "3_4") )
(And I endorse the use of which in Dirk's answer when using "[", even though some claim it is not needed.)