conditional count by group and create a new vector

I want to create a new vector (or other object) holding the number of unique ids that have no associated age value, i.e. ids whose age is missing (NA).

Please find an MWE below.

df <- data.frame(id=c('a', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'f', 'f', 'g', 'h', 'h'),
                 age=c(12, NA, NA, 25, 25, 26, NA, NA, NA, 3, 3, NA, 21, 21))

For this data the expected result is 3 (ids b, e and g).

Thanks.

CharlesLDN asked Dec 01 '25 01:12


2 Answers

IIUC you can try this:

library(dplyr)

df <- data.frame(id=c('a', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'f', 'f', 'g', 'h', 'h'),
                 age=c(12, NA, NA, 25, 25, 26, NA, NA, NA, 3, 3, NA, 21, 21))

(df
  |> group_by(id)                          # Group by ID
  |> summarise(all_na = all(is.na(age)))   # New column all_na, which is TRUE if all age values in that group are NA
  |> filter(all_na)                        # Only keep groups where the new column all_na is TRUE
  |> nrow())                               # Count the result

This returns 3, as expected.
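
For comparison, the same per-group "all ages missing" count can be sketched in base R, without dplyr. This is an illustrative alternative rather than part of the answer above; the names all_na_by_id and n_all_na are made up for the example.

```r
df <- data.frame(id=c('a', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'f', 'f', 'g', 'h', 'h'),
                 age=c(12, NA, NA, 25, 25, 26, NA, NA, NA, 3, 3, NA, 21, 21))

# For each id, check whether every age value in that group is NA
all_na_by_id <- tapply(is.na(df$age), df$id, all)

# TRUE counts as 1, so the sum is the number of ids with no age at all
n_all_na <- sum(all_na_by_id)
n_all_na
# [1] 3
```

tapply() applies all() to the is.na() flags within each id, so the logic mirrors the group_by() / summarise() steps.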

Robert Long answered Dec 03 '25 17:12


You could do

length(unique(df$id[is.na(df$age)]))

> df$id[is.na(df$age)] # ids which are not associated with an age
[1] "b" "b" "e" "e" "e" "g"
> unique(df$id[is.na(df$age)]) # unique of that
[1] "b" "e" "g"
> length(unique(df$id[is.na(df$age)])) # length of that
[1] 3

data

df <- data.frame(id=c('a', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'f', 'f', 'g', 'h', 'h'),
                 age=c(12, NA, NA, 25, 25, 26, NA, NA, NA, 3, 3, NA, 21, 21))
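
One caveat worth checking: length(unique(df$id[is.na(df$age)])) counts ids with at least one missing age, whereas a grouped all(is.na(age)) test counts ids whose ages are all missing. On the question's data both give 3, but they can differ when an id mixes missing and non-missing ages. A small sketch with made-up data (df2 and the variable names are hypothetical, not from the question):

```r
# Hypothetical data: id 'c' has one missing and one known age
df2 <- data.frame(id  = c('a', 'b', 'b', 'c', 'c'),
                  age = c(12, NA, NA, NA, 30))

# ids with at least one NA age: 'b' and 'c'
n_any_na <- length(unique(df2$id[is.na(df2$age)]))

# ids where every age is NA: only 'b'
n_all_na <- sum(tapply(is.na(df2$age), df2$id, all))

n_any_na
# [1] 2
n_all_na
# [1] 1
```

Which of the two is wanted depends on whether a partially recorded id should count as "not associated with an age".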
Tim G answered Dec 03 '25 17:12


