I want to create a new vector or object holding the number of unique ids that have no age value (missing data).
Please find an MWE below.
df <- data.frame(id=c('a', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'f', 'f', 'g', 'h', 'h'),
age=c(12, NA, NA, 25, 25, 26, NA, NA, NA, 3, 3, NA, 21, 21))
In this case the expected result is 3 (ids b, e and g).
Thanks.
IIUC you can try this:
library(dplyr)
df <- data.frame(id=c('a', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'f', 'f', 'g', 'h', 'h'),
age=c(12, NA, NA, 25, 25, 26, NA, NA, NA, 3, 3, NA, 21, 21))
df |>
  group_by(id) |>                         # group rows by id
  summarise(all_na = all(is.na(age))) |>  # all_na is TRUE if every age in the group is NA
  filter(all_na) |>                       # keep only groups where all_na is TRUE
  nrow()                                  # count the remaining groups
which returns 3, as expected.
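If you would rather not load dplyr, the same "every age in the group is missing" logic can be sketched in base R with tapply(). This is only an illustration on the question's data; the names all_na, ids_without_age and n_without_age are made up for the example.

```r
df <- data.frame(
  id  = c('a', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'f', 'f', 'g', 'h', 'h'),
  age = c(12, NA, NA, 25, 25, 26, NA, NA, NA, 3, 3, NA, 21, 21)
)

# For each id, TRUE when every age value in that group is NA
all_na <- tapply(is.na(df$age), df$id, all)

# ids that never have an age recorded (illustrative names)
ids_without_age <- names(all_na)[all_na]
n_without_age   <- length(ids_without_age)

n_without_age
```

On the example data this should give the same count as the dplyr pipeline.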
You could do this in base R:
length(unique(df$id[is.na(df$age)]))
> df$id[is.na(df$age)] # ids which are not associated with an age
[1] "b" "b" "e" "e" "e" "g"
> unique(df$id[is.na(df$age)]) # unique of that
[1] "b" "e" "g"
> length(unique(df$id[is.na(df$age)])) # length of that
[1] 3
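One caveat: this one-liner counts ids that have at least one missing age, whereas the dplyr version counts ids whose ages are all missing. The two agree on the example data because no id mixes known and missing ages. A small sketch of where they would differ, using a hypothetical data frame df2 that is not part of the question:

```r
# Hypothetical data: id 'b' has one missing and one known age
df2 <- data.frame(
  id  = c('a', 'b', 'b', 'c'),
  age = c(12, NA, 30, NA),
  stringsAsFactors = FALSE
)

# ids with at least one missing age
any_missing <- unique(df2$id[is.na(df2$age)])

# ids with no known age at all: drop every id that has a non-missing age
all_missing <- setdiff(unique(df2$id), df2$id[!is.na(df2$age)])

length(any_missing)
length(all_missing)
```

Which of the two you want depends on whether an id with a partly recorded age should count as "not associated with an age".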