How to summarise a categorical variable with missing data?

Question

I'm trying to run a group_by() and summarise() on a categorical variable, frailty score. The data has multiple observations per subject, some of which are missing, e.g.:

Subject  Frailty
1        Managing well
1        NA
1        NA
2        NA
2        NA
2        Vulnerable
3        NA
3        NA
3        NA

I would like to summarise the data so that each subject gets a frailty description if one is available, and NA if not, e.g.:

Subject  Frailty
1        Managing well
2        Vulnerable 
3        NA

I tried the following two approaches, both of which returned errors:

Mode <- function(x) {
  ux <- na.omit(unique(x[!is.na(x)]))
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

data %>%
  group_by(Subject) %>%
  summarise(frailty = Mode(frailty))

Error: Expecting a single value: [extent=2].

condense <- function(x) {
  unique(x[!is.na(x)])
}

data %>%
  group_by(subject) %>%
  summarise(frailty = condense(frailty))

Error: Column frailty must be length 1 (a summary value), not 0

tmfmnk · Accepted Answer

One solution involving dplyr could be:

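library(dplyr)
# Keep the first row per subject whose Frailty is not NA;
# if every value is NA, which.min() returns 1 and the NA row is kept.
df %>%
  group_by(Subject) %>%
  slice(which.min(is.na(Frailty)))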

  Subject Frailty      
    <int> <chr>        
1       1 Managing_well
2       2 Vulnerable   
3       3 <NA>
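
For anyone who wants to try this end to end, here is a minimal sketch. The data frame `df` is rebuilt by hand from the table in the question, and the names `by_slice` and `by_summarise` are only illustrative. The second pipeline is not part of the answer above; it is one possible way to stay with summarise(), which always returns exactly one value per group. That seems to be what both attempts in the question ran into: condense() returns a zero-length vector when every value is NA, and Mode() can return more than one value when there is a tie.

```r
library(dplyr)

# Example data rebuilt from the question (assumed column names: Subject, Frailty)
df <- tibble(
  Subject = rep(1:3, each = 3),
  Frailty = c("Managing well", NA, NA,
              NA, NA, "Vulnerable",
              NA, NA, NA)
)

# Approach from the answer: keep the first non-missing row per subject.
# For subject 3 every is.na() value is TRUE, so which.min() returns 1
# and the (NA) first row is kept.
by_slice <- df %>%
  group_by(Subject) %>%
  slice(which.min(is.na(Frailty))) %>%
  ungroup()

# Alternative with summarise(): drop the NAs, then take element [1].
# Indexing an empty vector with [1] gives NA, so the result is always length 1.
by_summarise <- df %>%
  group_by(Subject) %>%
  summarise(Frailty = Frailty[!is.na(Frailty)][1])

print(by_slice)
print(by_summarise)
```

Note that both versions pick the first non-missing value, which is enough when each subject has at most one distinct description. If a subject can have several different descriptions, you would still need a rule for choosing between them.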

Tags: r, missing-data, categorical-data, summary, lexicalgap
