I have a question similar to this post (How to find first non-zero element and last non-zero element and TRIM vector),
but I am looking for a dplyr solution, because I also want to add a grouping step such as
d %>%
group_by(id) %>%
to the pipeline.
I have a data frame:
d<-data.frame(time = factor(c("00:00","00:15","00:30","00:45", "01:00","01:15","01:30","01:45","02:00","02:40" )), q=c(0,0,100,0,0,100,0,0,0,0),p=c(.25,.25,.25,.25,.25,.25,.25,.25,.25,.25))
d
time q p
1 00:00 0 0.25
2 00:15 0 0.25
3 00:30 100 0.25
4 00:45 0 0.25
5 01:00 0 0.25
6 01:15 100 0.25
7 01:30 0 0.25
8 01:45 0 0.25
9 02:00 0 0.25
10 02:40 0 0.25
I would like to drop the rows that come before the first non-zero value of column "q" and the rows that come after the last non-zero value of "q". For the data above, the result should look like this:
00:30 100 0.25
00:45 0 0.25
01:00 0 0.25
01:15 100 0.25
@akrun gave a solution to this question:
indx <- which(d$q!=0)
d[indx[1L]:indx[length(indx)],]
This works, but I am looking for a dplyr solution, as I want to perform this calculation across multiple groups.
One option could be:
d %>%
filter(row_number() %in% Reduce(`:`, which(q != 0)))
time q p
1 00:30 100 0.25
2 00:45 0 0.25
3 01:00 0 0.25
4 01:15 100 0.25
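Note that `Reduce(`:`, ...)` is tidy when there are exactly two non-zero values, as in the example. With three or more, `:` receives a vector as its left operand, uses only the first element and emits a warning. A minimal sketch of a variant that takes the first and last non-zero positions explicitly is below; the data frame `d3` and the helper name `nz` are made up for illustration and are not part of the original question.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data with three non-zero values in q (an assumption for this sketch)
d3 <- data.frame(q = c(0, 5, 0, 7, 0, 9, 0))

res <- d3 %>%
  filter({
    nz <- which(q != 0)                 # positions of the non-zero values
    if (length(nz) == 0) {
      rep(FALSE, n())                   # no non-zero value at all: keep nothing
    } else {
      # keep rows between the first and the last non-zero position
      row_number() >= min(nz) & row_number() <= max(nz)
    }
  })

res
```

The guard for an all-zero column is a design choice of this sketch; without it, `min()` and `max()` would be called on an empty vector.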
d<-data.frame(time = factor(c("00:00","00:15","00:30","00:45", "01:00","01:15","01:30","01:45","02:00","02:40" )), q=c(0,0,100,0,0,100,0,0,0,0),p=c(.25,.25,.25,.25,.25,.25,.25,.25,.25,.25))
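library(dplyr, warn.conflicts = FALSE)
d %>%
  # drop leading rows: the running count of non-zero q values is still 0 there
  filter(cumsum(q != 0) != 0) %>%
  # drop trailing rows: the same count, taken from the end, is 0 there
  filter(rev(cumsum(rev(q != 0))) != 0)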
#> time q p
#> 1 00:30 100 0.25
#> 2 00:45 0 0.25
#> 3 01:00 0 0.25
#> 4 01:15 100 0.25
Created on 2021-06-12 by the reprex package (v2.0.0)
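Since the question asks for a per-group version, here is a sketch of how the cumsum idea might be combined with group_by(). The `id` column and the data frame `d2` are assumptions for illustration only; the original data has no grouping column.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical two-group data; `id` is assumed, it is not in the original d
d2 <- data.frame(
  id = rep(c("a", "b"), each = 5),
  q  = c(0, 100, 0, 100, 0,
         0, 0, 50, 0, 0)
)

trimmed <- d2 %>%
  group_by(id) %>%
  filter(
    cumsum(q != 0) > 0,            # at or after the first non-zero q in the group
    rev(cumsum(rev(q != 0))) > 0   # at or before the last non-zero q in the group
  ) %>%
  ungroup()

trimmed
```

Both conditions are evaluated within each group, so every `id` is trimmed independently. A group whose q is all zeros is dropped entirely under this approach.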