
R: Extract complete cases/included observations from linear model or formula variables

Tags: r, data.table

After running m1 <- lm(f1, data=DT) I want to save the observations that are included (akin to "obs <- complete.cases(m1)", but something that works) so that I can run a second regression on the same observations: m2 <- lm(f2, data=DT[obs]).

Alternatively, I would like to get the observations that are complete for a given set of variables as defined by a formula object. Consider this R-like pseudocode:

f1 <- as.formula("y ~ x1 + x2 + x3")
f2 <- as.formula("y ~ x1 + x2")
obs <- complete.cases(DT[, list(all.vars(f1))])
m2 <- lm(f2, data=DT[obs])

How do I do this? In the first case, lm already does the work implicitly; how can I extract it? In the second, all.vars returns a character vector; how do I properly create an unquoted list that DT (data.table) will understand?

rjturn asked Mar 07 '15 18:03



1 Answer

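From data.table v1.9.5, na.omit has a cols argument, which restricts the NA check to the named columns. Since all.vars returns a character vector of the variable names in a formula, it can be passed directly: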

na.omit(DT, cols = all.vars(f))
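
For completeness, here is a minimal sketch covering both parts of the question. The toy data and the names DT_cc, dropped and used are made up for illustration and are not from the original post. It shows the na.omit route, an equivalent logical index built with complete.cases (with = FALSE lets a character vector select columns), and one way to recover the rows lm() actually used via na.action. The last route assumes the model was fitted without a subset argument, so the dropped positions refer to rows of DT.

```r
library(data.table)

# Hypothetical toy data with a few missing values (not from the original post)
DT <- data.table(
  y  = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1),
  x1 = c(1, 2, 3, 4, 5, 6, 7, 8),
  x2 = c(2, 1, NA, 3, 5, 4, 6, 8),
  x3 = c(NA, 1, 2, 2, 3, 5, 4, 6)
)

f1 <- y ~ x1 + x2 + x3
f2 <- y ~ x1 + x2

# Route 1: keep only rows that are complete in every variable named in f1
DT_cc <- na.omit(DT, cols = all.vars(f1))

# Route 2: a logical index over the same variables;
# with = FALSE tells data.table to treat the character vector as column names
obs <- complete.cases(DT[, all.vars(f1), with = FALSE])

# Route 3: ask the fitted model which rows it dropped
m1 <- lm(f1, data = DT)
dropped <- na.action(m1)                      # positions of omitted rows, NULL if none
used <- setdiff(seq_len(nrow(DT)), dropped)   # positions of rows lm() kept

# Second regression on exactly the observations used by the first
m2 <- lm(f2, data = DT[obs])
```

DT[obs], DT[used] and DT_cc should all refer to the same rows here, so any of them can be passed as the data for the second model.
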
Arun answered Oct 22 '22 17:10