
Propagating data within a vector

Tags: r, vector

I'm learning R and I'm curious... I need a function that does this:

> fillInTheBlanks(c(1, NA, NA, 2, 3, NA, 4))
[1] 1 1 1 2 3 3 4
> fillInTheBlanks(c(1, 2, 3, 4))
[1] 1 2 3 4

I came up with the version below, but I suspect there's a more idiomatic R way to do this.

fillInTheBlanks <- function(v) {
  ## replace each NA with the latest preceding available value

  result <- v
  for (i in seq_along(v)) {   # seq_along() is safe for zero-length input
    value <- v[i]
    if (!is.na(value))
      result[i:length(v)] <- value   # overwrite everything from i onward
  }
  return(result)
}

asked Nov 23 '09 11:11 by mariotomo


3 Answers

Package zoo has a function na.locf():

R> library("zoo")
R> na.locf(c(1, 2, 3, 4))
[1] 1 2 3 4
R> na.locf(c(1, NA, NA, 2, 3, NA, 4))
[1] 1 1 1 2 3 3 4

na.locf: Last Observation Carried Forward; Generic function for replacing each ‘NA’ with the most recent non-‘NA’ prior to it.

See the source code of na.locf.default; it does not need a for loop.
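
To illustrate what a loop-free approach can look like, here is a minimal base R sketch. It is not the zoo source code; `locf_sketch` is a hypothetical name, and unlike the default `na.locf()` call it simply leaves leading NA values in place.

```r
# Hypothetical helper, not part of zoo: carry the last observed value forward
# without an explicit loop.
locf_sketch <- function(v) {
  observed <- !is.na(v)
  # cumsum(observed) counts the observed values seen up to each position.
  # A count of 0 means nothing has been observed yet, so it is mapped to the
  # NA placed in front of the observed values.
  c(NA, v[observed])[cumsum(observed) + 1]
}

filled  <- locf_sketch(c(1, NA, NA, 2, 3, NA, 4))  # 1 1 1 2 3 3 4
leading <- locf_sketch(c(NA, 1, NA))               # NA 1 1
print(filled)
print(leading)
```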

answered Nov 07 '22 09:11 by rcs


I did some minimal copy and paste from the zoo package (thanks again, rcs, for pointing me at it), and this is what I really needed:

fillInTheBlanks <- function(S) {
  ## Replace each NA in S with an observed value.
  ##
  ## Accepts a vector possibly holding NA values and returns a vector
  ## where all observed values are carried forward and the first is
  ## also carried backward.  Cf. na.locf from the zoo package.
  L <- !is.na(S)
  c(S[L][1], S[L])[cumsum(L)+1]
}
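
To see why the one-liner works, it may help to look at the intermediate values for an input that starts with NA. This is only a walkthrough of the function body above; the names `obs`, `idx`, and `lookup` are introduced here for illustration.

```r
S <- c(NA, 1, NA, NA, 2, 3, NA, 4)

L      <- !is.na(S)        # FALSE TRUE FALSE FALSE TRUE TRUE FALSE TRUE
obs    <- S[L]             # observed values only: 1 2 3 4
idx    <- cumsum(L)        # observed values seen so far: 0 1 1 1 2 3 3 4
lookup <- c(obs[1], obs)   # first observed value duplicated in front: 1 1 2 3 4

# idx + 1 is 1 for the leading NA (picking the duplicated first value, i.e.
# carrying it backward) and otherwise picks the latest observed value.
filled <- lookup[idx + 1]  # 1 1 1 1 2 3 3 4
print(filled)
```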
answered Nov 07 '22 08:11 by mariotomo


Just for fun (since it's slower than fillInTheBlanks), here's a version of na.locf that relies on the rle function:

my.na.locf <- function(v, fromLast = FALSE) {
  if(fromLast){
    return(rev(my.na.locf(rev(v))))
  }
  nas <- is.na(v)
  e <- rle(nas)
  v[nas] <- rep.int(c(NA,v[head(cumsum(e$lengths),-1)]),e$lengths)[nas]
  return(v)
}

For example:

v1 <- c(3,NA,NA,NA,1,2,NA,NA,5)
v2 <- c(NA,NA,NA,1,7,NA,NA,5,NA)

my.na.locf(v1)
#[1] 3 3 3 3 1 2 2 2 5

my.na.locf(v2)
#[1] NA NA NA  1  7  7  7  5  5

my.na.locf(v1, fromLast = TRUE)
#[1] 3 1 1 1 1 2 5 5 5

my.na.locf(v2, fromLast = TRUE)
#[1]  1  1  1  1  7  5  5  5 NA
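
The rle-based assignment is fairly dense, so here is a step-by-step sketch of the same expressions on v1. The names `ends`, `prev`, and `carried` are introduced only for this illustration.

```r
v   <- c(3, NA, NA, NA, 1, 2, NA, NA, 5)
nas <- is.na(v)
e   <- rle(nas)            # runs of non-NA / NA positions

e$lengths                  # 1 3 2 2 1
e$values                   # FALSE TRUE FALSE TRUE FALSE

ends <- cumsum(e$lengths)  # index of the last element of each run: 1 4 6 8 9

# For each run, the value that ends the previous run (NA for the first run).
prev <- c(NA, v[head(ends, -1)])      # NA 3 NA 2 NA

# Expand back to full length; only the positions inside NA runs are used.
carried <- rep.int(prev, e$lengths)   # NA 3 3 3 NA NA 2 2 NA
v[nas]  <- carried[nas]
print(v)                              # 3 3 3 3 1 2 2 2 5
```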

answered Nov 07 '22 09:11 by digEmAll