filter for complete cases in data.frame using dplyr (case-wise deletion)

Is it possible to filter a data.frame for complete cases using dplyr? complete.cases with a list of all variables works, of course, but that is a) verbose when there are many variables and b) impossible when the variable names are not known (e.g. in a function that processes any data.frame).

library(dplyr)

df = data.frame(
    x1 = c(1,2,3,NA),
    x2 = c(1,2,NA,5)
)

df %>%
  filter(complete.cases(x1,x2))

335

asked Mar 12 '14 13:03

user2503795

1 Answer

Try this:

df %>% na.omit

or this:

df %>% filter(complete.cases(.))

or this:

library(tidyr)
df %>% drop_na
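
For the case raised in the question, where the column names are not known in advance, any of these approaches can be wrapped in a function. The following is only a minimal sketch, assuming dplyr and tidyr are installed; the helper name `keep_complete` is hypothetical and not part of either package.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: keeps only rows with no NA in any column,
# without naming the columns explicitly.
keep_complete <- function(data) {
  data %>% filter(complete.cases(.))
}

# Example data from the question
df <- data.frame(
  x1 = c(1, 2, 3, NA),
  x2 = c(1, 2, NA, 5)
)

res_filter <- keep_complete(df)
res_drop   <- df %>% drop_na()
res_omit   <- df %>% na.omit()

# All three keep the same two rows (the first and second of df)
res_filter
```

Because the helper never refers to a column by name, it should work on a data.frame with any set of columns.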

If you want to filter based on one variable's missingness, use a conditional:

df %>% filter(!is.na(x1))

or, with tidyr:

df %>% drop_na(x1)
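
The same idea should extend to more than one chosen column. As a rough sketch on the example data from the question: `filter()` accepts several conditions, which are combined with AND, and `drop_na()` accepts several column names. The result variable names are only for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  x1 = c(1, 2, 3, NA),
  x2 = c(1, 2, NA, 5)
)

# Rows where x1 is present (x2 may still be NA)
by_x1_filter <- df %>% filter(!is.na(x1))
by_x1_drop   <- df %>% drop_na(x1)

# Rows where both x1 and x2 are present
by_both_filter <- df %>% filter(!is.na(x1), !is.na(x2))
by_both_drop   <- df %>% drop_na(x1, x2)
```

Filtering on x1 alone keeps three rows here, including the one where x2 is NA; requiring both columns keeps two.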

Other answers indicate that, of the solutions above, na.omit is much slower. That has to be balanced against the fact that it returns the row indices of the omitted rows in the na.action attribute, whereas the other solutions above do not.

str(df %>% na.omit)
## 'data.frame':   2 obs. of  2 variables:
##  $ x1: num  1 2
##  $ x2: num  1 2
##  - attr(*, "na.action")= 'omit' Named int  3 4
##    ..- attr(*, "names")= chr  "3" "4"
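
To make use of that attribute, the omitted row indices can be read back with `attr()`. A small sketch in base R on the same example data; the variable names `cleaned`, `omitted` and `dropped_rows` are just for illustration.

```r
df <- data.frame(
  x1 = c(1, 2, 3, NA),
  x2 = c(1, 2, NA, 5)
)

cleaned <- na.omit(df)

# Row indices of the omitted rows, as stored by na.omit()
omitted <- attr(cleaned, "na.action")
as.integer(omitted)
# 3 4

# Recover the dropped rows from the original data
dropped_rows <- df[as.integer(omitted), ]
```

This is handy when the rows that were removed need to be inspected or reported, which the filter() and drop_na() variants do not support directly.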

ADDED: Updated to reflect the latest versions of dplyr and tidyr, and the comments.

144

answered Oct 17 '22 02:10

G. Grothendieck

Tags: r, dplyr, magrittr
