
Fill NA's at boundary of a vector in R

I have a vector containing NA's at the boundary

x <- c(NA, -1, 1,-1, 1, NA, -1, 2, NA, NA)

I want the outcome to be:

c(-3, -1, 1,-1, 1, 0, -1, 2, 5, 8)

In other words, I want to fill both inner and boundary NA's with linear interpolation (maybe I cannot call it "inter-polation" since NA's are at boundaries).

I tried a function from the package "zoo", na.fill(x, "extend"), but the boundary output is not what I want: it just repeats the leftmost or rightmost non-NA value:

na.fill(x,"extend")

and the output is

[1] -1 -1  1 -1  1  0 -1  2  2  2

I also checked other functions for filling NA's, such as na.approx() and na.locf(), but none of them gives the result I want.

na.spline() does fill the boundary NA's, but the values it produces there vary far too much.

na.spline(x)

The output is:

 [1] -15.9475983  -1.0000000   1.0000000  -1.0000000   1.0000000   0.3400655  -1.0000000   2.0000000
 [9]  13.1441048  35.9323144

The boundary points are too large. Can anyone help me out? Thanks in advance!

Hongfei Li asked Dec 23 '22 20:12



2 Answers

You can use na.spline() from the zoo package:

library(zoo)
na.spline(x)

[1] 0.0 0.5 1.0 1.5 2.0 2.5

Data for the original question:

x <- c(0, NA, 1, NA, 2, NA)
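
One likely reason this works for the data above but not for the edited vector in the question: the three known points (0, 1, 2 at positions 1, 3, 5) lie on a straight line, and a cubic spline through collinear points is that same line. The sketch below checks this with base R's spline(). Treating spline() with method = "fmm" as a stand-in for na.spline() is an assumption made here, not something stated in the answer.

```r
# Known points of the data above: positions 1, 3, 5 with values 0, 1, 2.
pos <- c(1, 3, 5)
val <- c(0, 1, 2)

# Evaluate the spline at every position, including the trailing NA at 6.
fit <- spline(pos, val, xout = 1:6, method = "fmm")$y

# Straight line with slope 0.5 through the same points, for comparison.
line <- 0.5 * (1:6 - 1)

# TRUE if the spline coincides with the line (up to floating-point error).
isTRUE(all.equal(fit, line))
```

When the known points are not collinear, as in the edited question, the spline's end segments curve, which is consistent with the large boundary values reported there.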
tmfmnk answered Dec 25 '22 08:12



Given the data and expected output after the question's edit, I believe the following function does it. It fills in the interior NA's with approxfun and then extends each boundary run of NA's linearly, continuing the slope between the two nearest filled values.

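na.extrapol <- function(y){
  x <- seq_along(y)
  # Linear interpolation over the non-NA points; approxfun() returns NA
  # outside their range, so only interior NA's are filled here.
  f <- approxfun(x[!is.na(y)], y[!is.na(y)])
  y[is.na(y)] <- f(x[is.na(y)])
  # Runs of NA / non-NA; after the step above only boundary runs can be NA.
  r <- rle(is.na(y))
  if(r$values[1]){
    # Leading NA's: continue the slope of the first two filled values.
    Y <- y[r$lengths[1] + 1:2]
    X <- seq_len(r$lengths[1])
    y[rev(X)] <- Y[1] - diff(Y)*X
  }
  n <- length(r$lengths)
  if(r$values[n]){
    # Trailing NA's: continue the slope of the last two filled values.
    s <- sum(r$lengths[-n])
    Y <- y[s - 1:0]
    X <- seq_len(r$lengths[n])
    y[s + X] <- Y[2] + diff(Y)*X
  }
  y
}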

x <- c(NA, -1, 1,-1, 1, NA, -1, 2, NA, NA)
na.extrapol(x)
#[1] -3 -1  1 -1  1  0 -1  2  5  8

x2 <- c(NA, NA, -1, 1,-1, 1, NA, -1, 2, NA, NA)
na.extrapol(x2)
#[1] -5 -3 -1  1 -1  1  0 -1  2  5  8
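
The same idea can also be written without rle(), by computing the two boundary slopes directly from the positions of the known values. This is only a sketch under stated assumptions: fill_linear is a hypothetical name (it is not part of zoo or base R), and it assumes the vector has at least two non-NA values.

```r
# Hypothetical helper: interpolate interior NA's linearly and extend the
# boundary NA's along the slope of the nearest pair of known values.
fill_linear <- function(y) {
  idx <- seq_along(y)
  ok  <- which(!is.na(y))
  stopifnot(length(ok) >= 2)   # two known points are needed to define a slope
  k   <- length(ok)

  # approx() fills the interior; positions outside the known range stay NA.
  out <- approx(ok, y[ok], xout = idx)$y

  # Slopes of the first and last segments between known values.
  slope_l <- (y[ok[2]] - y[ok[1]])     / (ok[2] - ok[1])
  slope_r <- (y[ok[k]] - y[ok[k - 1]]) / (ok[k] - ok[k - 1])

  # Extend those segments over the leading and trailing positions.
  left  <- idx < ok[1]
  right <- idx > ok[k]
  out[left]  <- y[ok[1]] + slope_l * (idx[left]  - ok[1])
  out[right] <- y[ok[k]] + slope_r * (idx[right] - ok[k])
  out
}

x <- c(NA, -1, 1,-1, 1, NA, -1, 2, NA, NA)
fill_linear(x)
#[1] -3 -1  1 -1  1  0 -1  2  5  8
```

Unlike na.extrapol, this version stops with an error when fewer than two values are known, rather than relying on approxfun to fail.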
Rui Barradas answered Dec 25 '22 08:12
