
In R, can I make the table() function return the number of NA values in a named element?

Tags: r, na

I am using R to summarize a large amount of data for a report. I want to use lapply() to generate a list of tables from the table() function, from which I can extract my desired statistics. There are a lot of these, so I've written a function to do it. My issue is that I can't return the number of missing (NA) values, even though each table contains it, because I can't figure out how to tell R that I want the element of the table() result that holds the NA count. As far as I can tell, R is "naming" that element NA, and I can't index by that name.

I'm trying to avoid writing some complex statement where I say something like which(is.na(names(element[1]))) | names(element[1])=="var_I_want" because I feel like that's just really wordy. I was hoping there was some way to either tell R to label the NA variable in each table with a character name, or to tell it to pick the one labeled NA, but I haven't had much luck yet.

Minimal example:

example <- data.frame(ID=c(10,20,30,40,50),
                      V1=c("A","B","A",NA,"C"),
                      V2=c("Dog","Cat",NA,"Cat","Bunny"),
                      V3=c("Yes","No","No","Yes","No"),
                      V4=c("No",NA,"No","No","Yes"),
                      V5=c("No","Yes","Yes",NA,"No"))

varlist <- c("V1","V2","V3","V4","V5")

list_o_tables <- lapply(X=example[varlist],FUN=table,useNA="always")

list(V1=list_o_tables[["V1"]]["A"],
     V2=list_o_tables[["V2"]]["Cat"],
     V3=list_o_tables[["V3"]]["Yes"],
     V4=list_o_tables[["V4"]]["Yes"],
     V5=list_o_tables[["V5"]]["Yes"])

What I get:

$V1
A 
2 

$V2
Cat 
  2 

$V3
Yes 
  2 

$V4
Yes 
  1 

$V5
Yes 
  2

What I'd like:

$V1
A     <NA>
2       1

$V2
Cat   <NA>
  2     1

$V3
Yes   <NA> 
  2     0

$V4
Yes   <NA> 
  1     1

$V5
Yes   <NA> 
  2     1
TARehman asked Dec 06 '13 22:12




1 Answer

This is ugly (IMHO) but it works:

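# Tabulate x (always including the NA count), then relabel the entries:
# the sorted non-missing values, followed by the string "NA".
my_table <- function(x){
    setNames(table(x, useNA = "always"), c(sort(unique(x[!is.na(x)])), "NA"))
}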

So you'd lapply this instead, and then you'd have access to the NA column.
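As a rough sketch of how that might look with the question's data (assuming the columns are character vectors, which `stringsAsFactors = FALSE` forces here), the NA count can then be pulled out by the string name "NA". The names `wanted` and `result` are hypothetical, introduced only for this illustration:

```r
# Data copied from the question; stringsAsFactors = FALSE keeps the
# columns as character vectors regardless of R version.
example <- data.frame(ID = c(10, 20, 30, 40, 50),
                      V1 = c("A", "B", "A", NA, "C"),
                      V2 = c("Dog", "Cat", NA, "Cat", "Bunny"),
                      V3 = c("Yes", "No", "No", "Yes", "No"),
                      V4 = c("No", NA, "No", "No", "Yes"),
                      V5 = c("No", "Yes", "Yes", NA, "No"),
                      stringsAsFactors = FALSE)

varlist <- c("V1", "V2", "V3", "V4", "V5")

# my_table() as defined in the answer above.
my_table <- function(x) {
    setNames(table(x, useNA = "always"), c(sort(unique(x[!is.na(x)])), "NA"))
}

list_o_tables <- lapply(example[varlist], my_table)

# Hypothetical lookup: the level of interest for each variable.
wanted <- c(V1 = "A", V2 = "Cat", V3 = "Yes", V4 = "Yes", V5 = "Yes")

# For each table, keep the level of interest plus the "NA" entry.
result <- Map(function(tab, level) tab[c(level, "NA")], list_o_tables, wanted)
result
```

Each element of `result` should then hold two counts, the chosen level and "NA", which is the shape the question asks for.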

Looking more closely, this is rooted in the behavior of factor:

levels(factor(c(1,NA,2),exclude = NULL))
[1] "1" "2" NA 

My recollection is that the distinction between a factor level of NA versus "NA" has been at the very least a source of confusion in R in the past. I feel like I've seen some debates about the merits of this on r-devel, but I can't recall for sure at the moment.

So the issue is: if you have a factor with NA values, what do you call the levels? Technically, the current behavior is correct. One of the levels is "missing", not literally the string "NA". It would be nice (IMHO) if table didn't adhere to this quite so strictly, though.
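Since the NA entry of a table has a genuinely missing name, one possible variation is to leave the other labels alone and rename only that entry. The sketch below is not the function from this answer; `count_with_na` and its `na_label` argument are hypothetical names. It relies on `names()` of a one-dimensional table returning the level labels, with `NA` for the missing-value entry. One motivation is factor input: when a factor has unused levels, rebuilding the labels from `unique(x)` yields fewer labels than table entries, whereas renaming in place avoids that.

```r
# Hypothetical helper: tabulate x and give the NA entry a character name.
count_with_na <- function(x, na_label = "NA") {
    tab <- table(x, useNA = "always")
    # Only the missing-value entry has an NA name; relabel just that one.
    names(tab)[is.na(names(tab))] <- na_label
    tab
}

# Character input
tab_chr <- count_with_na(c("A", "B", "A", NA, "C"))
tab_chr[c("A", "NA")]

# Factor input with an unused level ("Maybe")
f <- factor(c("Yes", "No", NA), levels = c("No", "Yes", "Maybe"))
tab_fct <- count_with_na(f)
tab_fct["NA"]

# Without renaming, the same entry can be selected by position:
raw <- table(f, useNA = "always")
raw[is.na(names(raw))]
```

If the data could contain the literal string "NA" as a value, a different `na_label` (say "missing") would avoid two entries sharing a name.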

joran answered Nov 11 '22 12:11