Impute missing data with mean by group

I have a categorical variable with three levels (A, B, and C).

I also have a continuous variable with some missing values in it.

I would like to replace each NA value with the mean of its group. That is, missing observations from group A have to be replaced with the mean of group A.

I know I can just calculate each group's mean and replace missing values, but I'm sure there's another way to do so more efficiently with loops.

# Subset group A, check its mean, then fill its NAs with that mean
A <- subset(data, group == "A")
mean(A$variable, na.rm = TRUE)
A$variable[is.na(A$variable)] <- mean(A$variable, na.rm = TRUE)

Now, I understand I could do the same for group B and C, but perhaps a for loop (with if and else) might do the trick?
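For reference, the loop described here could be sketched as below. The toy data frame is hypothetical and only mirrors the column names used in the question (group and variable); no if/else is needed, because a logical index selects the rows to fill.

```r
# Hypothetical toy data: one grouping column, one numeric column with NAs
data <- data.frame(
  group    = c("A", "A", "A", "B", "B", "C", "C"),
  variable = c(1, NA, 3, 10, NA, NA, 7)
)

# Loop over the group levels and fill each group's NAs with that group's mean
for (g in unique(data$group)) {
  in_group   <- data$group == g
  group_mean <- mean(data$variable[in_group], na.rm = TRUE)
  data$variable[in_group & is.na(data$variable)] <- group_mean
}

print(data)
```

This sketch assumes the group column itself has no missing values.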

asked Mar 25 '19 20:03

Jonatan Ottino

1 Answer

library(dplyr)
# Within each group, replace NA with that group's mean
data <- data %>%
  group_by(group) %>%
  mutate(variable = ifelse(is.na(variable), mean(variable, na.rm = TRUE), variable)) %>%
  ungroup()

For a faster, base-R version, you can use ave:

data$variable <- ave(data$variable, data$group, FUN = function(x)
  ifelse(is.na(x), mean(x, na.rm = TRUE), x))
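To see what the ave call does, here is a small sketch on made-up data. The data frame toy and the column imputed are hypothetical names chosen for illustration; the result is written to a new column so the original values stay visible next to the filled ones.

```r
# Hypothetical toy data; group "C" has no observed values at all
toy <- data.frame(
  group    = c("A", "A", "A", "B", "B", "C"),
  variable = c(1, NA, 3, 10, NA, NA)
)

# Same ave() call as above, stored in a separate column for comparison
toy$imputed <- ave(toy$variable, toy$group, FUN = function(x)
  ifelse(is.na(x), mean(x, na.rm = TRUE), x))

print(toy)
```

Groups A and B get their NAs filled with 2 and 10. Group C has nothing to average, so mean(x, na.rm = TRUE) returns NaN there and the value stays missing; such groups may need separate handling.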

answered Sep 19 '22 21:09

iod

Tags: loops, r, missing-data, imputation
