
Select last non-zero value before each run of zero, by group

Tags: dataframe, r, dplyr

I have the following data frame:

   variable       Date values values2
1         a 2017-01-01      3       1
2         a 2017-01-02      4       2
3         a 2017-01-03      5       1 # non-zero followed by zero
4         a 2017-01-04      0       2
5         a 2017-01-05      0       2
6         a 2017-01-06      0       3
7         b 2017-01-01     10       1
8         b 2017-01-02     11       2
9         b 2017-01-03     12       2
10        b 2017-01-04     13       3
11        b 2017-01-05     14       0
12        b 2017-01-06     15       1
13        c 2017-01-01     45       2
14        c 2017-01-02     50       3 # non-zero followed by zero
15        c 2017-01-03      0       0
16        c 2017-01-04      0       2
17        c 2017-01-05     10       1 # non-zero followed by zero
18        c 2017-01-06      0       1

I want to select the rows where a non-zero value is immediately followed by a zero in the "values" column, within each variable:

1 a        2017-01-03      5       1
2 c        2017-01-02     50       3
3 c        2017-01-05     10       1

Please note that a single variable may contain more than one such row (variable c has two).

My code does not work correctly: it returns only the last non-zero row for each variable, so the first match for c (2017-01-02) is missing:

test_df <- structure(list(variable = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "c", "c", "c", "c", "c","c"), 
                          Date = structure(c(17167, 17168, 17169, 17170, 17171, 17172, 17167, 17168, 17169, 17170, 17171, 17172, 17167, 17168, 17169, 17170, 17171, 17172), class = "Date"), 
                          values = c(3, 4, 5, 0, 0, 0, 10, 11, 12, 13, 14, 15, 45, 50,0, 0, 10, 0),
                          values2 = c(1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 0, 1, 2, 3, 0, 2, 1, 1)), 
                          row.names = c(NA, -18L), class = "data.frame", .Names = c("variable", "Date", "values", "values2"))

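library(dplyr)

test_df %>%
  group_by(variable = factor(variable)) %>%
  filter(any(values == 0)) %>%  # keep only groups that contain a zero
  filter(values != 0) %>%       # drop the zero rows
  arrange(Date) %>%
  slice(n()) %>%                # keeps just one row (the last) per group
  ungroup()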

1 a        2017-01-03      5       1
2 c        2017-01-05     10       1
Zizou asked Dec 02 '22 08:12



1 Answer

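One way to do this is to keep, within each group, the rows where "values" is non-zero and the next row's "values" is zero. lead() returns NA for the last row of a group, and filter() drops rows where the condition is NA, so a trailing non-zero value is never selected. This assumes the rows are already sorted by Date within each variable: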

library(dplyr)

test_df %>%
  group_by(variable) %>%
  filter(values != 0 & lead(values) == 0)

  variable       Date values values2
1        a 2017-01-03      5       1
2        c 2017-01-02     50       3
3        c 2017-01-05     10       1
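
If the rows are not guaranteed to be sorted by Date, a possible variant is to pass `order_by = Date` to `lead()`, so that "next" means the next date rather than the next stored row. The sketch below rebuilds the question's data with `data.frame()` and reverses the row order on purpose; the names `shuffled` and `result` are only for illustration and are not part of the original post.

```r
library(dplyr)

# Same values as the question's test_df, rebuilt so the example is self-contained
test_df <- data.frame(
  variable = rep(c("a", "b", "c"), each = 6),
  Date     = rep(seq(as.Date("2017-01-01"), by = "day", length.out = 6), times = 3),
  values   = c(3, 4, 5, 0, 0, 0, 10, 11, 12, 13, 14, 15, 45, 50, 0, 0, 10, 0),
  values2  = c(1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 0, 1, 2, 3, 0, 2, 1, 1)
)

# Reverse the rows to simulate input that is not sorted by Date (illustrative only)
shuffled <- test_df[rev(seq_len(nrow(test_df))), ]

result <- shuffled %>%
  group_by(variable) %>%
  # order_by makes lead() compare each row with the next Date in its group,
  # regardless of how the rows are stored
  filter(values != 0, lead(values, order_by = Date) == 0) %>%
  ungroup() %>%
  arrange(variable, Date)

result
```

On this data it should return the same three rows as above (a on 2017-01-03, c on 2017-01-02 and 2017-01-05). Sorting first with `arrange(variable, Date)` and then using plain `lead(values)` would be an equivalent alternative.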
Anoushiravan R answered Dec 03 '22 23:12
