Calculate the percentage of NAs in each column using dplyr

I have a data frame in which some columns have missing values. Is there a way (using dplyr) to efficiently calculate the percentage of each column that is missing, i.e. NA? Sort of like a colSums equivalent, so I don't have to calculate the percentage missing for each column individually.

asked Nov 04 '15 02:11

MP61

2 Answers

First, I created some test data for you:

a <- c(1, NA, NA, 4)
b <- c(NA, 2, 3, 4)
x <- data.frame(a, b)
x
#    a  b
# 1  1 NA
# 2 NA  2
# 3 NA  3
# 4  4  4

Then you can use colMeans(is.na(x)), which returns the proportion of NAs in each column (multiply by 100 for a percentage):

colMeans(is.na(x))
#    a    b 
# 0.50 0.25
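
Since the question asks for percentages rather than proportions, here is a minimal sketch of the same idea scaled by 100. The name pct_missing is only illustrative, and the test data is rebuilt so the snippet runs on its own:

```r
# Same test data as above, rebuilt so this snippet is self-contained
x <- data.frame(a = c(1, NA, NA, 4), b = c(NA, 2, 3, 4))

# is.na(x) is a logical matrix; the column means of TRUE/FALSE values
# are the proportions of NAs, so multiplying by 100 gives percentages
pct_missing <- 100 * colMeans(is.na(x))
pct_missing
#  a  b
# 50 25
```

This is base R only, so it does not need dplyr at all.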

answered Oct 17 '22 07:10

Gavin

We can use summarise_each:

 library(dplyr)
 x %>% 
   summarise_each(funs(100*mean(is.na(.))))
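
Note that summarise_each() and funs() have since been deprecated in dplyr. Assuming dplyr 1.0.0 or later is installed, a rough equivalent with across() could look like the sketch below (pct_missing_df is just an illustrative name, and the test data from the other answer is rebuilt here):

```r
library(dplyr)

# Test data from the other answer, rebuilt so this snippet is self-contained
x <- data.frame(a = c(1, NA, NA, 4), b = c(NA, 2, 3, 4))

# Apply the same summary to every column: the mean of is.na() is the
# proportion of NAs, scaled by 100 to get a percentage
pct_missing_df <- x %>%
  summarise(across(everything(), ~ 100 * mean(is.na(.x))))
pct_missing_df
#    a  b
# 1 50 25
```

Replacing everything() with another selection helper such as where(is.numeric) should restrict the summary to a subset of columns.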

answered Oct 17 '22 05:10

akrun
