data.table fill missing values from other rows by group

Question

# have
> aDT <- data.table(colA = c(1,1,1,1,2,2,2,2,3,3,3,3), colB = c(4,NA,NA,1,4,3,NA,NA,4,NA,2,NA))
> aDT
    colA colB
 1:    1    4
 2:    1   NA
 3:    1   NA
 4:    1    1
 5:    2    4
 6:    2    3
 7:    2   NA
 8:    2   NA
 9:    3    4
10:    3   NA
11:    3    2
12:    3   NA
# want
> bDT <- data.table(colA = c(1,1,1,1,2,2,2,2,3,3,3,3), colB = c(4,1,1,1,4,3,3,3,4,2,2,2))
> bDT
    colA colB
 1:    1    4
 2:    1    1
 3:    1    1
 4:    1    1
 5:    2    4
 6:    2    3
 7:    2    3
 8:    2    3
 9:    3    4
10:    3    2
11:    3    2
12:    3    2

Would like to fill missing values according to the algorithm below: within each group ('colA'),

use the value from one row below, if it's still NA, keeps going until the last row within that group
if all NAs in rows below, look at rows above (go up 1 row at a time)
if all NAs, then NA

Since the dataset is quite large, algorithmic efficiency is part of consideration. Not sure if there's any package for this type of operation already. How to do it?

tyluRp · Accepted Answer

With data.table and zoo:

library(data.table)
library(zoo)

# Last observation carried forward from last row of group
dt <- dt[, colB := na.locf0(colB, fromLast = TRUE), by = colA]

# Last observation carried forward for first row of group
dt[, colB := na.locf(colB), by = colA][]

Or in a single chain:

dt[, colB := na.locf0(colB, fromLast = TRUE), by = colA][
   , colB := na.locf(colB), by = colA][]

Both return:

    colA colB
 1:    1    4
 2:    1    1
 3:    1    1
 4:    1    1
 5:    2    4
 6:    2    3
 7:    2    3
 8:    2    3
 9:    3    4
10:    3    2
11:    3    2
12:    3    2

Data:

text <- "colA colB
    1    4
    1   NA
    1   NA
    1    1
    2    4
    2    3
    2   NA
    2   NA
    3    4
    3   NA
    3    2
    3   NA"

dt <- fread(input = text, stringsAsFactors = FALSE)

data.table fill missing values from other rows by group

Tags:

r

na

data.table

row

LeGeniusII

1 Answers

tyluRp

Recent Activity

Donate For Us

data.table fill missing values from other rows by group

Tags:

r

na

data.table

row

LeGeniusII

1 Answers

tyluRp

Related questions

Recent Activity

Donate For Us