How to fill NAs based on a grouped variable?

Question

My test data:

tableid  id    name    weight   weightdate
1        1    david    100      01/01/2020
2        1    david    100      01/01/2020
3        1    david    NA       NA
4        2    anne     NA       NA
5        3    peter    150      02/10/2020
6        3    peter    150      02/10/2020

I have some records of a same person (in this case david), which have both valid (100 and 01/01/2020) and NA data in weight and weightdate. Im looking to normalize the NA by replacing them with valid data from that person. For the case of anne who doesnt have any valid data, i would leave it as is. I was thinking about using fill.

My desired output:

tableid  id    name    weight   weightdate
1        1    david    100      01/01/2020
2        1    david    100      01/01/2020
3        1    david    100      01/01/2020
4        2    anne     NA       NA
5        3    peter    150      02/10/2020
6        3    peter    150      02/10/2020

Andre Wildberg · Accepted Answer

You can use fill() function from tidyr package:

library(tidyr)

df1 %>% group_by(name) %>% fill(weight, weightdate) %>% ungroup
# A tibble: 6 x 5
  tableid    id name  weight weightdate
    <int> <int> <chr>  <int> <chr>     
1       1     1 david    100 01/01/2020
2       2     1 david    100 01/01/2020
3       3     1 david    100 01/01/2020
4       4     2 anne      NA NA        
5       5     3 peter    150 02/10/2020
6       6     3 peter    150 02/10/2020

Data

df1 <- structure(list(tableid = 1:6, id = c(1L, 1L, 1L, 2L, 3L, 3L), 
    name = c("david", "david", "david", "anne", "peter", "peter"
    ), weight = c(100L, 100L, NA, NA, 150L, 150L), weightdate = c("01/01/2020", 
    "01/01/2020", NA, NA, "02/10/2020", "02/10/2020")), class = "data.frame", row.names = c(NA, 
-6L))

TarJae · Answer

This is an alternative. Andre Wildberg already provided my favorite:

library(dplyr)
df %>% 
  group_by(name) %>% 
  mutate(across(everything(), ~ifelse(is.na(.), lag(.),.)))

  tableid    id name  weight weightdate
    <int> <int> <chr>  <int> <chr>     
1       1     1 david    100 01/01/2020
2       2     1 david    100 01/01/2020
3       3     1 david    100 01/01/2020
4       4     2 anne      NA NA        
5       5     3 peter    150 02/10/2020
6       6     3 peter    150 02/10/2020

data:

structure(list(tableid = 1:6, id = c(1L, 1L, 1L, 2L, 3L, 3L), 
name = c("david", "david", "david", "anne", "peter", "peter"
), weight = c(100L, 100L, NA, NA, 150L, 150L), weightdate = c("01/01/2020", 
"01/01/2020", NA, NA, "02/10/2020", "02/10/2020")), class = "data.frame", row.names = c(NA, 
-6L))

How to fill NAs based on a grouped variable?

Tags:

r

Andres Mora

2 Answers

Data

Andre Wildberg

TarJae

Recent Activity

Donate For Us

How to fill NAs based on a grouped variable?

Tags:

r

Andres Mora

2 Answers

Data

Andre Wildberg

TarJae

Related questions

Recent Activity

Donate For Us