I am trying to solve the following problem. I have a tibble:
> tibble( signal = c(0,1,0,0,1,0,0,1,1,1,1,1,1,0), days =0)
# A tibble: 14 x 2
signal days
<dbl> <dbl>
1 0 0
2 1 0
3 0 0
4 0 0
5 1 0
6 0 0
7 0 0
8 1 0
9 1 0
10 1 0
11 1 0
12 1 0
13 1 0
14 0 0
I need to fill the days column so that the result looks like this:
signal days
<dbl> <dbl>
1 0 0
2 1 1
3 0 2
4 0 3
5 1 4
6 0 0
7 0 0
8 1 1
9 1 2
10 1 3
11 1 4
12 1 1
13 1 2
14 0 3
I can do it with a for loop, but I am having a hard time doing it in a vectorized way, preferably using dplyr.
Appreciate any help!
Here is something basic with data.table::set()
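library(data.table)

# df is the data from the question; set() updates it by reference
i <- 1L
n <- nrow(df)
while (i <= n) {  # <= so a signal in the very last row is not skipped
    if (df$signal[i] == 1) {
        # fill at most 4 rows, stopping at the end of the table
        k <- min(i + 3L, n)
        set(df, i = i:k, j = "days", value = 1:(k - i + 1L))
        i <- i + 4L  # jump past the window so signals inside it are ignored
    } else {
        i <- i + 1L
    }
}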
# signal days
# 1 0 0
# 2 1 1
# 3 0 2
# 4 0 3
# 5 1 4
# 6 0 0
# 7 0 0
# 8 1 1
# 9 1 2
# 10 1 3
# 11 1 4
# 12 1 1
# 13 1 2
# 14 0 3
Here's an Rcpp solution. Although it contains a loop, a compiled C++ loop has very low overhead compared to an R-level loop, and it is likely about as quick as you are going to get:
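Rcpp::cppFunction("IntegerVector fill_column(IntegerVector v) {
    // Write into a fresh zero-filled vector so the input is never modified in place
    IntegerVector out(v.length());
    bool flag = false;  // true while inside a 4-row counting window
    int counter = 1;
    for (int i = 0; i < v.length(); ++i) {
        if (flag) {
            out[i] = counter++;
            if (counter == 5) {  // window complete: reset and wait for the next signal
                flag = false;
                counter = 1;
            }
        } else if (v[i] == 1) {
            out[i] = counter++;  // a signal starts a new window at 1
            flag = true;
        }
    }
    return out;
}")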
This allows you to use the function inside dplyr:
df %>% mutate(days = fill_column(signal))
#> # A tibble: 14 x 2
#> signal days
#> <dbl> <int>
#> 1 0 0
#> 2 1 1
#> 3 0 2
#> 4 0 3
#> 5 1 4
#> 6 0 0
#> 7 0 0
#> 8 1 1
#> 9 1 2
#> 10 1 3
#> 11 1 4
#> 12 1 1
#> 13 1 2
#> 14 0 3
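If you would rather avoid compiling anything, the same state machine can be sketched in base R with Reduce(). This is only a rough sketch, not a truly vectorized solution: next_day and fill_days are names I made up for illustration, and I am assuming the rule implied by the expected output (a 1 starts a count from 1 to 4, and signals inside that 4-row window are ignored).

```r
# Hypothetical helper (not from the thread): next `days` value given the
# previous `days` value and the current signal.
next_day <- function(prev, s) {
  if (prev >= 1 && prev <= 3) {
    prev + 1L   # inside a 4-row window: keep counting, ignore the signal
  } else if (s == 1) {
    1L          # no window open (prev is 0 or 4) and a signal arrives
  } else {
    0L          # no window open and no signal
  }
}

# Hypothetical wrapper: fold next_day over the signal, keeping every
# intermediate state, then drop the initial 0 that seeds the fold.
fill_days <- function(signal) {
  Reduce(next_day, signal, accumulate = TRUE, init = 0L)[-1]
}

signal <- c(0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0)
fill_days(signal)
#> [1] 0 1 2 3 4 0 0 1 2 3 4 1 2 3
```

Since it takes a vector and returns a vector of the same length, it should drop into mutate(days = fill_days(signal)) the same way as fill_column, though it will be slower than the Rcpp version on long inputs.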