Find names of columns which contain missing values

Tags: r, na
I want to find all the names of columns with NA or missing data and store these column names in a vector.

# create matrix
a <- c(1,2,3,4,5,NA,7,8,9,10,NA,12,13,14,NA,16,17,18,19,20)
cnames <- c("aa", "bb", "cc", "dd", "ee")
mymatrix <- matrix(a, nrow = 4, ncol = 5, byrow = TRUE)
colnames(mymatrix) <- cnames
mymatrix
#      aa bb cc dd ee
# [1,]  1  2  3  4  5
# [2,] NA  7  8  9 10
# [3,] NA 12 13 14 NA
# [4,] 16 17 18 19 20

The desired result: columns "aa" and "ee".

My attempt:

bad <- character()
for (j in 1:4){     
  tmp <- which(colnames(mymatrix[j, ]) %in% c("", "NA"))
  bad <- tmp
}

However, I keep getting integer(0) as my output. Any help is appreciated.


asked Dec 04 '13 00:12

lever

3 Answers

Like this?

colnames(mymatrix)[colSums(is.na(mymatrix)) > 0]
# [1] "aa" "ee"

Or as suggested by @thelatemail:

names(which(colSums(is.na(mymatrix)) > 0))
# [1] "aa" "ee"
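To make the one-liner easier to follow, here is a sketch that breaks it into steps on the matrix from the question. The intermediate names `na_mask` and `na_counts` are only illustrative and do not appear in the original answer.

```r
# Rebuild the matrix from the question
a <- c(1,2,3,4,5,NA,7,8,9,10,NA,12,13,14,NA,16,17,18,19,20)
mymatrix <- matrix(a, nrow = 4, ncol = 5, byrow = TRUE)
colnames(mymatrix) <- c("aa", "bb", "cc", "dd", "ee")

# Logical matrix: TRUE wherever a value is missing
na_mask <- is.na(mymatrix)

# Number of missing values per column (TRUE counts as 1)
na_counts <- colSums(na_mask)
na_counts
# aa bb cc dd ee
#  2  0  0  0  1

# Keep the names of the columns with at least one NA
bad <- colnames(mymatrix)[na_counts > 0]
bad
# [1] "aa" "ee"

# Why the loop in the question finds nothing: a single row is a plain
# vector, and colnames() of a vector is NULL
colnames(mymatrix[1, ])
# NULL
```

The loop in the question also compares column names against the string "NA" rather than testing the values with `is.na()`, so it would not detect missing data even if `colnames()` returned something.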


answered Oct 21 '22 18:10

Henrik

R 3.1 introduced an anyNA function, which is more convenient and faster:

colnames(mymatrix)[ apply(mymatrix, 2, anyNA) ]

Old answer:

If it's a very long matrix, apply + any can short circuit and run a bit faster.

apply(is.na(mymatrix), 2, any)
#   aa    bb    cc    dd    ee 
# TRUE FALSE FALSE FALSE  TRUE 
colnames(mymatrix)[apply(is.na(mymatrix), 2, any)]
# [1] "aa" "ee"
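As a quick check, the sketch below runs both forms on the matrix from the question and confirms that they select the same columns. The names `new_way` and `old_way` are only illustrative, and `anyNA` needs R 3.1 or later.

```r
# Rebuild the matrix from the question
a <- c(1,2,3,4,5,NA,7,8,9,10,NA,12,13,14,NA,16,17,18,19,20)
mymatrix <- matrix(a, nrow = 4, ncol = 5, byrow = TRUE)
colnames(mymatrix) <- c("aa", "bb", "cc", "dd", "ee")

# anyNA() tests each column directly, without an intermediate logical matrix
new_way <- colnames(mymatrix)[apply(mymatrix, 2, anyNA)]

# Older form: build the full is.na() matrix first, then test each column
old_way <- colnames(mymatrix)[apply(is.na(mymatrix), 2, any)]

new_way
# [1] "aa" "ee"
identical(new_way, old_way)
# [1] TRUE
```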

answered Oct 21 '22 16:10

Neal Fultz

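If you have a data frame with non-numeric columns, this solution is more general (building on previous answers). Note that mymatrix must be a data frame here: sapply iterates over the columns of a data frame but over the individual cells of a matrix, so convert a matrix with as.data.frame() first.

R 3.1+:

names(which(sapply(mymatrix, anyNA)))

Older versions of R:

names(which(sapply(mymatrix, function(x) any(is.na(x)))))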
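A minimal sketch of the data frame case follows. The data frame `df` and its columns are made up for illustration and are not part of the question. The last lines show what happens when the same call is applied to a matrix instead.

```r
# Hypothetical data frame with mixed column types (not from the question)
df <- data.frame(
  id    = 1:4,
  name  = c("a", NA, "c", "d"),
  score = c(1.5, 2.5, 3.5, 4.5),
  flag  = c(TRUE, NA, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# sapply() visits each column of a data frame, whatever its type
bad <- names(which(sapply(df, anyNA)))
bad
# [1] "name" "flag"

# The matrix from the question, for comparison
a <- c(1,2,3,4,5,NA,7,8,9,10,NA,12,13,14,NA,16,17,18,19,20)
mymatrix <- matrix(a, nrow = 4, ncol = 5, byrow = TRUE)
colnames(mymatrix) <- c("aa", "bb", "cc", "dd", "ee")

# On a matrix, sapply() visits individual cells, so the column names are lost
names(which(sapply(mymatrix, anyNA)))
# NULL

# Converting to a data frame first restores column-wise behaviour
bad_matrix <- names(which(sapply(as.data.frame(mymatrix), anyNA)))
bad_matrix
# [1] "aa" "ee"
```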

answered Oct 21 '22 16:10

verbamour
