 

Conditionally remove rows from a data frame (more than one condition)

I have searched SO, and although there are many Q&As about conditionally removing rows, none of them fits my problem.

I have a data.frame containing longitudinal measurements of variables x, y, etc., at various time points (time), in several subjects (id). Some subjects experience an event ev at some time (coded as 1, otherwise 0). I would like to reduce the initial data.frame to:

  • 1) All rows for subjects that have not experienced an event (OK, that's easy), but also
  • 2) For subjects that have experienced an event, all rows prior to the event (that is, all rows with times less than the time of that individual's event).

so that,

testdf<-data.frame(id=c(rep("A",4),rep("B",4),rep("C",4) ),
                   x=c(NA, NA, 1,2, 3, NA, NA, 1, 2, NA,NA, 5), 
                   y=rev(c(NA, NA, 1,2, 3, NA, NA, 1, 2, NA,NA, 5)),
                   time=c(1,2,3,4,0.1,0.5,10,20,3,2,1,0.5),
                   ev=c(0,0,0,0,0,1,0,0,0,0,0,1))

would reduce to

   id  x  y time ev
1   A NA  5  1.0  0
2   A NA NA  2.0  0
3   A  1 NA  3.0  0
4   A  2  2  4.0  0
5   B  3  1  0.1  0
6   C  2  2  3.0  0
7   C NA  1  2.0  0
8   C NA NA  1.0  0
ECII asked Jan 26 '13 14:01



1 Answer

Here's a solution with subset and ave:

subset(testdf, !ave(ev, id, FUN = cumsum))
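To see why this works, the one-liner can be unpacked step by step. The following is a minimal sketch, not part of the original answer; the names events_so_far, keep, and result are introduced here purely for illustration. ave() applies cumsum() within each id, giving a running count of events per subject, and negating that count keeps only the rows where it is still 0.

```r
# Sample data copied from the question.
testdf <- data.frame(
  id   = c(rep("A", 4), rep("B", 4), rep("C", 4)),
  x    = c(NA, NA, 1, 2, 3, NA, NA, 1, 2, NA, NA, 5),
  y    = rev(c(NA, NA, 1, 2, 3, NA, NA, 1, 2, NA, NA, 5)),
  time = c(1, 2, 3, 4, 0.1, 0.5, 10, 20, 3, 2, 1, 0.5),
  ev   = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1)
)

# Running count of events within each subject, taken in row order.
events_so_far <- ave(testdf$ev, testdf$id, FUN = cumsum)

# Keep a row only while its subject's count is still 0, i.e. no event yet.
# The event row itself has a count of 1, so it is dropped as well.
keep <- events_so_far == 0
result <- testdf[keep, ]

# Row subsetting keeps the original row names (1:5 and 9:11 here);
# resetting them gives the 1..8 numbering shown in the question.
rownames(result) <- NULL
print(result)
```

Note that the running count follows the order of the rows, not the values in time. For subject C the times are decreasing, and the expected output in the question keeps the rows that appear before the event row, so this approach assumes the rows of each subject are already in the intended sequence.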
Sven Hohenstein answered Sep 17 '22 15:09