I have a tibble with a column called meanSR_strong and another called meanSR_weak. If there are 10 or more consecutive NAs in the meanSR_strong column, I would like to replace those values with the values from the meanSR_weak column, even if the replacement values are also NA. If there are fewer than 10 consecutive NAs in the meanSR_strong column, then I don't need to do any replacing.
For example, rows 3-6 are all NA, but that is only four consecutive, so it doesn't matter. However, rows 15-28 are all NA (and that is more than 10 in a row), so I want to sub in values from the meanSR_weak column.
I know how to replace all the NAs, but I haven't figured out a nice way of coding this!
Here is my data:
x=structure(list(meanSR_strong = c(NA, 0.376009009009009, NA, NA,
NA, NA, 0.615585585585586, NA, 0.607354054054054, 0.590210810810811,
0.57005045045045, 0.596616216216216, 0.584066666666667, 0.538597297297297,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.639010810810811,
0.634272972972973), meanSR_weak = c(0.574724324324324, 0.562030630630631,
0.586247747747748, NA, NA, NA, 0.615585585585586, NA, 0.607354054054054,
0.590210810810811, 0.57005045045045, 0.596616216216216, 0.608510810810811,
0.538597297297297, NA, NA, NA, 0.555463063063063, 0.376715315315315,
NA, NA, NA, NA, NA, NA, 0.60972972972973, NA, NA, 0.639010810810811,
0.634272972972973), cloud.pct_strong = c(100, 36.036036036036,
98.1981981981982, 100, 100, 100, 0, 100, 0, 0, 0, 0, 3.6036036036036,
0, NA, NA, 100, 67.5675675675676, 100, 100, NA, 100, 100, 100,
100, 74.7747747747748, 100, 100, 0, 0), cloud.pct_weak = c(0,
0, 0, 100, 100, 100, 0, 100, 0, 0, 0, 0, 0, 0, NA, NA, 100, 0,
36.036036036036, 67.5675675675676, NA, 100, 100, 100, 100, 0.900900900900901,
100, 60.3603603603604, 0, 0), date = structure(c(951868800, 951955200,
952041600, 952128000, 952214400, 952300800, 952387200, 952473600,
952560000, 952646400, 952732800, 952819200, 952905600, 952992000,
953078400, 953164800, 953251200, 953337600, 953424000, 953510400,
953596800, 953683200, 953769600, 953856000, 953942400, 954028800,
954115200, 954201600, 954288000, 954374400), class = c("POSIXct",
"POSIXt"), tzone = "UTC")), .Names = c("meanSR_strong", "meanSR_weak",
"cloud.pct_strong", "cloud.pct_weak", "date"), row.names = c(NA,
-30L), class = c("tbl_df", "tbl", "data.frame"))
The R rle function can be used for this. First build an rle object (a list with "lengths" and "values" components, see ?rle) from the is.na result:
z <- rle(is.na(x$meanSR_strong))
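To see what rle returns, here is a minimal sketch on a toy vector (v and r are made-up names for illustration, not part of the data above):

```r
# Toy vector: one run of 2 NAs and one run of 3 NAs
v <- c(1, NA, NA, 2, NA, NA, NA)
r <- rle(is.na(v))

r$values   # FALSE  TRUE FALSE  TRUE
r$lengths  # 1 2 1 3

# rep() expands the runs back into the original logical vector
identical(rep(r$values, r$lengths), is.na(v))  # TRUE
```

Each entry of r$values describes one run, and the matching entry of r$lengths says how long that run is.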
Then change the z$values entries from TRUE to FALSE wherever the run of NAs is shorter than a threshold that you choose. Here I choose 10:
z$values[z$lengths < 10 & z$values] <- FALSE
Then reconstruct a logical vector for indexing with the [<- function, using the rep function, which here acts essentially as an inverse of rle:
idx <- rep(z$values, z$lengths)  # TRUE only inside NA runs of length >= 10
x[idx, "meanSR_strong"] <- x[idx, "meanSR_weak"]
print(x, n=30)
# A tibble: 30 x 5
meanSR_strong meanSR_weak cloud.pct_strong cloud.pct_weak date
<dbl> <dbl> <dbl> <dbl> <dttm>
1 NA 0.5747243 100.000000 0.0000000 2000-03-01
2 0.3760090 0.5620306 36.036036 0.0000000 2000-03-02
3 NA 0.5862477 98.198198 0.0000000 2000-03-03
4 NA NA 100.000000 100.0000000 2000-03-04
5 NA NA 100.000000 100.0000000 2000-03-05
6 NA NA 100.000000 100.0000000 2000-03-06
7 0.6155856 0.6155856 0.000000 0.0000000 2000-03-07
8 NA NA 100.000000 100.0000000 2000-03-08
9 0.6073541 0.6073541 0.000000 0.0000000 2000-03-09
10 0.5902108 0.5902108 0.000000 0.0000000 2000-03-10
11 0.5700505 0.5700505 0.000000 0.0000000 2000-03-11
12 0.5966162 0.5966162 0.000000 0.0000000 2000-03-12
13 0.5840667 0.6085108 3.603604 0.0000000 2000-03-13
14 0.5385973 0.5385973 0.000000 0.0000000 2000-03-14
15 NA NA NA NA 2000-03-15
16 NA NA NA NA 2000-03-16
17 NA NA 100.000000 100.0000000 2000-03-17
18 0.5554631 0.5554631 67.567568 0.0000000 2000-03-18
19 0.3767153 0.3767153 100.000000 36.0360360 2000-03-19
20 NA NA 100.000000 67.5675676 2000-03-20
21 NA NA NA NA 2000-03-21
22 NA NA 100.000000 100.0000000 2000-03-22
23 NA NA 100.000000 100.0000000 2000-03-23
24 NA NA 100.000000 100.0000000 2000-03-24
25 NA NA 100.000000 100.0000000 2000-03-25
26 0.6097297 0.6097297 74.774775 0.9009009 2000-03-26
27 NA NA 100.000000 100.0000000 2000-03-27
28 NA NA 100.000000 60.3603604 2000-03-28
29 0.6390108 0.6390108 0.000000 0.0000000 2000-03-29
30 0.6342730 0.6342730 0.000000 0.0000000 2000-03-30
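If the same replacement is needed in more than one place, the steps can be wrapped in a small function. This is only a sketch: the name fill_long_na_runs and its arguments are hypothetical, and it uses base R's inverse.rle in place of the rep call.

```r
# Hypothetical helper: replace NA runs of at least `min_run` elements
# in `primary` with the corresponding elements of `fallback`.
fill_long_na_runs <- function(primary, fallback, min_run = 10) {
  stopifnot(length(primary) == length(fallback))
  runs <- rle(is.na(primary))
  # Keep a run flagged only if it is an NA run and long enough
  runs$values <- runs$values & runs$lengths >= min_run
  idx <- inverse.rle(runs)  # expand the runs back to one logical per element
  primary[idx] <- fallback[idx]
  primary
}

# Toy data (not the question's data): only the run of 3 NAs is replaced
strong <- c(1, NA, NA, 4, NA, NA, NA, 8)
weak   <- c(10, 20, 30, 40, 50, NA, 70, 80)
filled <- fill_long_na_runs(strong, weak, min_run = 3)
filled  # 1 NA NA 4 50 NA 70 8
```

With the tibble from the question, the call would be x$meanSR_strong <- fill_long_na_runs(x$meanSR_strong, x$meanSR_weak).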