
Bind residuals to input dataset with missing values [duplicate]

I am looking for a way to bind the residuals from lm() to the input dataset. Rows that were dropped because of missing values should get NA, and every residual must line up with the row it belongs to.

Sample data:

N <- 100 
Nrep <- 5 
X <- runif(N, 0, 10) 
Y <- 6 + 2*X + rnorm(N, 0, 1) 
X[ sample(which(Y < 15), Nrep) ] <- NA
df <- data.frame(X,Y)

residuals(lm(Y ~ X,data=df,na.action=na.omit))

Residuals should be bound to df.

metasequoia asked Jan 15 '23 20:01


2 Answers

Simply change the na.action to na.exclude:

residuals(lm(Y ~ X, data = df, na.action = na.exclude))

na.omit and na.exclude both perform casewise deletion: any row with a missing value in a predictor or in the response is dropped before the model is fitted. They differ only in how extractor functions such as residuals() and fitted() report the result. With na.exclude, the output is padded with NA for the dropped rows, so it has the same length as the input variables.
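As a minimal sketch of that difference, the block below rebuilds the question's df (the seed is an arbitrary choice so the block runs on its own) and compares the two residual vectors. The column name res is only illustrative.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility
N <- 100
Nrep <- 5
X <- runif(N, 0, 10)
Y <- 6 + 2*X + rnorm(N, 0, 1)
X[sample(which(Y < 15), Nrep)] <- NA
df <- data.frame(X, Y)

res_omit    <- residuals(lm(Y ~ X, data = df, na.action = na.omit))
res_exclude <- residuals(lm(Y ~ X, data = df, na.action = na.exclude))

length(res_omit)     # N - Nrep: rows with missing X are dropped
length(res_exclude)  # N: dropped rows are padded with NA

# The padded vector lines up with the rows of df, so it can be bound directly
df$res <- res_exclude

# NA residuals sit exactly on the rows where X is missing
all(is.na(df$res) == is.na(df$X))
```

Because the padded vector already has one entry per row, no join or id column is needed.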

(this is the best solution found here)

Tomas answered Jan 17 '23 08:01


Alternatively, give each row an id and use merge() or join():

N <- 100 
Nrep <- 5 
X <- runif(N, 0, 10) 
Y <- 6 + 2*X + rnorm(N, 0, 1) 
X[ sample(which(Y < 15), Nrep) ] <- NA
df <- data.frame(X,Y)

df$id <- rownames(df)

res <- residuals(lm(Y ~ X,data=df,na.action=na.omit))
tmp <- data.frame(res=res)
tmp$id <- names(res)

merge(df,tmp,by="id",sort=FALSE,all.x=TRUE)

merge() does not guarantee the original row order, even with sort=FALSE. If you need to maintain the order, use join() from the plyr package:

library(plyr)
# a left join by default; naming the key avoids the "Joining by: id" message
join(df, tmp, by="id")
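
The id matching works because residuals() names its output after the row names of df. Relying on that, a base R sketch can skip the join and fill a new column by matching those names. This is an alternative illustration rather than part of the answer's method; the seed and the column name res are assumptions.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility
N <- 100
Nrep <- 5
X <- runif(N, 0, 10)
Y <- 6 + 2*X + rnorm(N, 0, 1)
X[sample(which(Y < 15), Nrep)] <- NA
df <- data.frame(X, Y)

res <- residuals(lm(Y ~ X, data = df, na.action = na.omit))

# Start with an all-NA column, then fill only the rows that have a residual
df$res <- NA_real_
df$res[match(names(res), rownames(df))] <- res

# Rows with missing X keep NA; the row order of df is untouched
sum(is.na(df$res))
```

Since only a column is filled in, df keeps its original row order.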

Brandon Bertelsen answered Jan 17 '23 08:01