I'd like to remove the rows that belong to a run of 3 or more consecutive NAs in one column.
[,1] [,2]
[1,] 1 1
[2,] NA 1
[3,] 2 4
[4,] NA 3
[6,] 1 4
[7,] NA 8
[8,] NA 5
[9,] NA 6
so that I'd end up with this data:
[,1] [,2]
[1,] 1 1
[2,] NA 1
[3,] 2 4
[4,] NA 3
[6,] 1 4
I did some research and tried this code:
data[!(rowSums(is.na(data)) > 3), ]
but I think this only counts the NAs within each row, not consecutive NAs down a column.
As mentioned, rle is a good place to start:
is.na.rle <- rle(is.na(data[, 1]))
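To make the structure concrete, here is a small sketch of what rle returns for the first column of the question's sample. The names x and runs are illustrative only, and the data is assumed to have the 8 rows shown in the question.

```r
# First column of the question's matrix (8 rows assumed).
x <- c(1, NA, 2, NA, 1, NA, NA, NA)

# rle() compresses the logical vector into runs of identical values.
runs <- rle(is.na(x))
runs$lengths  # 1 1 1 1 1 3
runs$values   # FALSE TRUE FALSE TRUE FALSE TRUE
```

Each TRUE in values marks a run of NAs, and the matching entry in lengths says how long that run is. Only the last run has length 3.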
Since NAs are "bad" only when they come in runs of three or more, we can rewrite the values:
is.na.rle$values <- is.na.rle$values & is.na.rle$lengths >= 3
Finally, use inverse.rle to expand the runs back into a logical vector (one entry per row) and filter with it:
data[!inverse.rle(is.na.rle), ]
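Putting the steps together, a minimal end-to-end sketch on the question's data could look like the following. The helper name drop_na_runs and its arguments col and min_run are hypothetical, introduced here only for illustration, and the matrix is rebuilt on the assumption that the sample has 8 rows.

```r
# Rebuild the question's matrix (the sample skips a row label; 8 rows assumed).
data <- cbind(c(1, NA, 2, NA, 1, NA, NA, NA),
              c(1, 1, 4, 3, 4, 8, 5, 6))

# Hypothetical helper: drop rows that fall inside a run of at least
# `min_run` consecutive NAs in column `col`.
drop_na_runs <- function(m, col = 1, min_run = 3) {
  runs <- rle(is.na(m[, col]))
  # Keep only the NA runs that are long enough to count as "bad".
  runs$values <- runs$values & runs$lengths >= min_run
  # Expand back to one logical per row; TRUE marks a row to remove.
  m[!inverse.rle(runs), , drop = FALSE]
}

result <- drop_na_runs(data)
nrow(result)  # 5
```

The isolated NAs in the second and fourth rows survive because their runs have length 1. Only the final three rows are dropped. Passing drop = FALSE keeps the result a matrix even if a single row remains.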