How to select rows in a data.frame without NA values [closed]

Tags:

dataframe

r

I have a data frame called data. I want to create a function f(data, collist) that takes data and a list of its columns, and returns only those rows of data in which none of the columns named in collist is NA. I know this can be done with a for loop, but I want to do it without one.

Also, please let me know whether it is generally more efficient in R to avoid loops.

Here is an example:

 A   B   C   D
 1   2  NA  NA
 2  NA  NA  NA
NA   3   7   5
NA   4   2  NA
 5   6  NA  NA

If collist contains B and C, then a reduced data frame containing rows 3 and 4 would be returned, because B or C (or both) is NA in rows 1, 2, and 5. I want a function because I will be using this operation many times. Through this question I hope to learn some new R tricks and make my whole program more elegant. Thanks.

Sumit asked Nov 08 '13 17:11


People also ask

How do you select rows that are not na?

To select the rows of an R data frame that contain no NA values, we can use the complete.cases function with single square brackets. For example, if we have a data frame called df that contains some missing values (NA), the rows without NA can be selected with the command df[complete.cases(df), ].
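As a minimal sketch of that selection, assuming a small made-up data frame (the name df and its columns are hypothetical, chosen only for illustration):

```r
# Hypothetical data frame with some missing values
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA))

# complete.cases() returns one logical per row: TRUE when the row has no NA
keep <- complete.cases(df)

# Single square brackets with a row index keep only the complete rows
complete_rows <- df[keep, ]
```

Here only the first row survives, because the second row is missing x and the third is missing y.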

How do you omit Na in a data frame?

To remove all rows that contain an NA, we can use the na.omit function. For example, if we have a data frame called df that contains some NA values, we can remove every row that contains at least one NA with the command na.omit(df).
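A short sketch of na.omit on a made-up data frame (df and its values are assumptions for illustration). It also shows that the result remembers which rows were dropped:

```r
# Hypothetical data frame with one NA in each of rows 2 and 3
df <- data.frame(x = c(1, NA, 3), y = c(4, 5, NA))

# na.omit() drops every row that has at least one NA
cleaned <- na.omit(df)

# The indices of the dropped rows are stored in the "na.action" attribute
dropped <- as.integer(attr(cleaned, "na.action"))
```

Unlike df[complete.cases(df), ], na.omit always considers every column, so it is less convenient when only some columns matter.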

How do I ignore na data in R?

First, to exclude missing values from mathematical operations, use the na.rm = TRUE argument. If these values are not excluded, most functions will return NA. We may also want to subset our data to obtain complete observations, that is, the observations (rows) that contain no missing data.
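For instance, with mean() on a made-up vector (the values are assumptions for illustration), the effect of na.rm looks roughly like this:

```r
# Hypothetical numeric vector with one missing value
x <- c(2, NA, 4)

# Without na.rm the missing value propagates and the result is NA
with_na <- mean(x)

# With na.rm = TRUE the NA is dropped first, giving the mean of 2 and 4
without_na <- mean(x, na.rm = TRUE)
```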

How do I remove rows containing NA values in R?

Using na.omit(), complete.cases(), rowSums(), or drop_na() (from the tidyr package), you can remove rows that contain NA (missing values) from an R data frame.
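The rowSums() approach is the least obvious of these, so here is a minimal base R sketch of it on a made-up data frame (df is hypothetical). It counts the NA values per row and keeps the rows whose count is zero:

```r
# Hypothetical data frame with one NA in each of rows 2 and 3
df <- data.frame(x = c(1, NA, 3), y = c(4, 5, NA))

# is.na(df) is a logical matrix; rowSums() counts the TRUE values in each row
na_per_row <- rowSums(is.na(df))

# Keep only the rows that contain no NA at all
no_na <- df[na_per_row == 0, ]
```

The same count can be compared against other thresholds, for example to keep rows with at most one missing value.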


1 Answer

It sounds like you are just looking for complete.cases. Here's an example:

#### SAMPLE DATA

set.seed(1)
m <- matrix(rnorm(20), 5)
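# Blank out 7 random cells. R 3.6.0 changed the default sample() algorithm,
# so on newer versions the NA positions may differ from the output below.
m[sample(length(m), 7)] <- NA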
mydf <- data.frame(m)
mydf
#           X1         X2        X3          X4
# 1         NA -0.8204684  1.511781 -0.04493361
# 2  0.1836433  0.4874291        NA          NA
# 3 -0.8356286  0.7383247        NA  0.94383621
# 4  1.5952808         NA -2.214700  0.82122120
# 5  0.3295078         NA        NA  0.59390132

#### SAMPLE EXTRACTION

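collist <- c("X1", "X2")
# Keep the rows with no NA in the collist columns;
# the second index also restricts the output to those columns
mydf[complete.cases(mydf[collist]), collist]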
#           X1        X2
# 2  0.1836433 0.4874291
# 3 -0.8356286 0.7383247

A5C1D2H2I1M1N2O1R2T1 answered Oct 13 '22 11:10
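The complete.cases call from the answer can be wrapped into the f(data, collist) function the question asks for. The following is only a sketch under assumptions: the name f and its arguments follow the question, the data frame is typed in from the question's example table, and, unlike the extraction in the answer, it returns all columns of the surviving rows rather than only those in collist.

```r
# Hypothetical helper; name and arguments are taken from the question
f <- function(data, collist) {
  # data[collist] stays a data frame even when collist names one column;
  # drop = FALSE does the same for the result
  data[complete.cases(data[collist]), , drop = FALSE]
}

# The example table from the question
data <- data.frame(
  A = c(1, 2, NA, NA, 5),
  B = c(2, NA, 3, 4, 6),
  C = c(NA, NA, 7, 2, NA),
  D = c(NA, NA, 5, NA, NA)
)

result <- f(data, c("B", "C"))
```

For the table as printed in the question, result holds rows 3 and 4, the only rows in which both B and C are present. No explicit loop is needed, because complete.cases checks all rows in a single vectorized call.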