
How to find the percentage of NAs in a data.frame?

Tags: dataframe, r, csv, na

I am trying to find the percentage of NAs in each column as well as in the whole data frame.

The first method, which I have commented out, gives me zero, and the second method, which is not commented out, gives me a matrix. I am not sure what I am missing. Any hint is truly appreciated!

cp.2006<-read.csv(file="cp2006.csv",head=TRUE)

#countNAs <- function(x) { 
#  sum(is.na(x)) 
#} 
#total=0
#for (i in col(cp.2006)) {
#  total=countNAs(i)+total
#}
#print(total)
count<-apply(cp.2006, 1, function(x) sum(is.na(x)))
dims<-dim(cp.2006)
num<-dims[1]*dims[2]
NApercentage<-(count/num) * 100
print(NApercentage)
Mona Jalal asked May 11 '14 19:05




1 Answer

x = data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5))

For the whole data frame (the result is a proportion between 0 and 1; multiply by 100 for a percentage):

sum(is.na(x))/prod(dim(x))

Or

mean(is.na(x))
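
Since the question asks for a percentage, here is a minimal sketch that scales the proportion by 100. It reuses the example data frame from this answer; the name `pct_na_total` is only illustrative and not part of the original code.

```r
x <- data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5))

# is.na(x) is a logical matrix; its mean is the share of TRUE (missing) cells
pct_na_total <- 100 * mean(is.na(x))
print(pct_na_total)  # 3 NAs out of 8 cells, i.e. 37.5
```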

For columns:

apply(x, 2, function(col)sum(is.na(col))/length(col))

Or

colMeans(is.na(x))
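
As a sketch of how this relates to the code in the question, the block below computes per-column percentages and then reproduces, on the small example data frame, what the two original attempts most likely did. The variable names (`pct_na_cols`, `na_per_row`, `na_in_col_index`) are illustrative assumptions, not part of the original code.

```r
x <- data.frame(x = c(1, 2, NA, 3), y = c(NA, NA, 4, 5))

# Per-column percentage of missing values
pct_na_cols <- 100 * colMeans(is.na(x))
print(pct_na_cols)

# apply() with MARGIN = 1 works row by row, so it returns one NA count
# per row rather than a single total (the second attempt in the question)
na_per_row <- apply(x, 1, function(r) sum(is.na(r)))
print(na_per_row)

# col(x) is a matrix of column indices, not the data itself, so it never
# contains NA and counting over it gives 0 (the first, commented-out attempt)
na_in_col_index <- sum(is.na(col(x)))
print(na_in_col_index)
```

With `MARGIN = 2` instead of `1`, `apply()` would iterate over columns, which is what the `apply()` line above does.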
Fernando answered Oct 02 '22 14:10