
Last Observation Carried Forward in a data frame? [duplicate]

I wish to implement a "Last Observation Carried Forward" for a data set I am working on which has missing values at the end of it.

Here is some simple code that does it (my question follows):

LOCF <- function(x)
{
    # Last Observation Carried Forward (for a left to right series);
    # only the trailing NAs are filled
    if (all(is.na(x))) return(x) # nothing to carry forward
    last <- max(which(!is.na(x))) # the location of the Last Observation to Carry Forward
    x[last:length(x)] <- x[last]
    return(x)
}


# example:
LOCF(c(1,2,3,4,NA,NA))  # 1 2 3 4 4 4
LOCF(c(1,NA,3,4,NA,NA)) # 1 NA 3 4 4 4 (the interior NA is left alone)

Now this works great for simple vectors. But if I were to try to use it on a data frame:

a <- data.frame(rep("a",4), 1:4,1:4, c(1,NA,NA,NA))
a
t(apply(a, 1, LOCF)) # will make a mess

It will turn my data frame into a character matrix.
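The coercion comes from apply() itself, which converts a data frame with as.matrix() before looping over its rows. A minimal sketch of that step follows; the column names col1 to col4 are an assumption added here for readability, the original example leaves them unnamed.

```r
a <- data.frame(col1 = rep("a", 4), col2 = 1:4, col3 = 1:4,
                col4 = c(1, NA, NA, NA))

# apply() works on matrices, so the data frame goes through as.matrix() first
m <- as.matrix(a)
is.character(m)                # TRUE: one text column forces every cell to character

# Without the non-numeric column the matrix stays numeric
is.numeric(as.matrix(a[-1]))   # TRUE
```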

Can you think of a way to do LOCF on a data.frame without turning it into a matrix? (I could use loops and such to clean up the mess, but I would love a more elegant solution.)

Tal Galili asked May 05 '10 19:05



2 Answers

If you do not want to load a big package like zoo just for the na.locf function, here is a short solution which also works if there are some leading NAs in the input vector.

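na.locf <- function(x) {
  v <- !is.na(x)               # TRUE where a value is observed
  # cumsum(v) counts the observations seen so far; the +1 offset makes
  # positions before the first observation pick the leading NA
  c(NA, x[v])[cumsum(v)+1]
}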
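As a rough sketch of how this helper could be used on a data frame without going through a matrix, the columns can be processed one at a time with lapply() and written back with df[] <- ..., which keeps the data.frame class. The data frame df and its columns id and score are made up for illustration; note that this fills down each column, not across each row.

```r
# Same helper as above, repeated so the example runs on its own
na.locf <- function(x) {
  v <- !is.na(x)
  c(NA, x[v])[cumsum(v) + 1]
}

# A leading NA stays NA; interior and trailing NAs are filled
na.locf(c(NA, 1, NA, 3, NA))   # NA 1 1 3 3

# Hypothetical data; stringsAsFactors = FALSE keeps id as character,
# since the helper is not written with factors in mind
df <- data.frame(id = c("a", "b", "c", "d"),
                 score = c(1, NA, NA, 4),
                 stringsAsFactors = FALSE)

# df[] <- ... replaces the columns but keeps the data.frame structure
df[] <- lapply(df, na.locf)
df$score                       # 1 1 1 4
```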
Henrik Seidel answered Oct 24 '22 04:10


The tidyr::fill() function can also be used: it carries the last observation in a column forward to fill in the NAs.

a <- data.frame(col1 = rep("a",4), col2 = 1:4, 
                col3 = 1:4, col4 = c(1,NA,NA,NA))
a
#   col1 col2 col3 col4
# 1    a    1    1    1
# 2    a    2    2   NA
# 3    a    3    3   NA
# 4    a    4    4   NA

library(tidyr) # provides fill() and re-exports the %>% pipe
a %>% tidyr::fill(col4)
#   col1 col2 col3 col4
# 1    a    1    1    1
# 2    a    2    2    1
# 3    a    3    3    1
# 4    a    4    4    1
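Note that fill() works down a column, whereas the question's apply(a, 1, LOCF) carries values across each row. For the row-wise case, one possible base R sketch is to restrict apply() to the numeric columns so that the text column never enters the matrix. The names locf_trailing and num are hypothetical, and integer columns come back as doubles because of the matrix round trip.

```r
# Trailing-NA fill in the spirit of the question's LOCF(); hypothetical name
locf_trailing <- function(x) {
  if (all(is.na(x))) return(x)        # nothing to carry forward
  last <- max(which(!is.na(x)))       # position of the last observed value
  x[last:length(x)] <- x[last]
  x
}

a <- data.frame(col1 = rep("a", 4), col2 = 1:4,
                col3 = 1:4, col4 = c(1, NA, NA, NA))

# Names of the numeric columns only, so as.matrix() stays numeric
num <- names(a)[vapply(a, is.numeric, logical(1))]

# apply() returns one column per input row, hence the t()
a[num] <- as.data.frame(t(apply(a[num], 1, locf_trailing)))

a$col4   # 1 2 3 4: each row's last observed value moved into col4
```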
Prradep answered Oct 24 '22 04:10
