
Make the sum of an empty set / a set of NAs return NA instead of 0?

Tags: r, na, sum

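The sum function returns 0 when it is applied to an empty set, which is also what happens when every value is NA and na.rm = TRUE removes them all. Is there a simple way to make it return NA when it is applied to a set that contains only NA values?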
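A minimal sketch of the behaviour described above, using only base R. The variable names are illustrative and not part of the original question.

```r
# Summing an empty vector gives 0, the additive identity.
empty_sum <- sum(integer(0))

# Without na.rm, any NA in the input makes the result NA.
na_sum <- sum(c(NA, NA))

# With na.rm = TRUE the NAs are dropped first, so an all-NA vector
# becomes an empty vector and the result is 0 again.
na_rm_sum <- sum(c(NA, NA), na.rm = TRUE)

print(c(empty = empty_sum, all_na = na_sum, all_na_removed = na_rm_sum))
```

The third case is the one that produces the unwanted 0 once the data are grouped.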

Here is a borrowed example:

test <- data.frame(name = rep(c("A", "B", "C"), each = 4),
               var1 = rep(c(1:3, NA), 3),
               var2 = 1:12,
               var3 = c(rep(NA, 4), 1:8))

test
    name var1 var2 var3
1     A    1    1   NA
2     A    2    2   NA
3     A    3    3   NA
4     A   NA    4   NA
5     B    1    5    1
6     B    2    6    2
7     B    3    7    3
8     B   NA    8    4
9     C    1    9    5
10    C    2   10    6
11    C    3   11    7
12    C   NA   12    8

I would like the sum of each of the three variables per name. Here is what I tried:

var_to_aggr <- c("var1","var2","var3")
aggr_by <- "name"
summed <- aggregate(test[var_to_aggr], by = test[aggr_by], FUN = "sum", na.rm = TRUE)

This gives me:

     name var1 var2 var3
1    A    6   10   0
2    B    6   26   10
3    C    6   42   26

But I need:

     name var1 var2 var3
1    A    6   10   NA
2    B    6   26   10
3    C    6   42   26

The sum for name A, var3 should be NA and not 0. Just to be clear, it should not be NA for name A, var1, where the set contains one NA but also valid values that should be summed. Any ideas?

I have been fiddling with na.action, but sum doesn't seem to accept it.

Asked by Kastany on May 21 '15 at 10:05



1 Answer

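You can try a wrapper that returns NA only when every value is NA:

f1 <- function(x) if(all(is.na(x))) NA_integer_ else sum(x, na.rm=TRUE)
# na.action=NULL keeps rows that contain NA; the formula method's default (na.omit) would drop them
aggregate(.~name, test, FUN=f1, na.action=NULL)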
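As a self-contained check, here is a sketch that plugs the same helper into the `by =` form of `aggregate` used in the question. It assumes the question's `test` data frame. The `by =` interface does not apply `na.omit` to the values, so no `na.action` argument is needed here.

```r
# Data frame from the question.
test <- data.frame(name = rep(c("A", "B", "C"), each = 4),
                   var1 = rep(c(1:3, NA), 3),
                   var2 = 1:12,
                   var3 = c(rep(NA, 4), 1:8))

# Return NA when every value is NA (or the vector is empty),
# otherwise sum the non-missing values.
f1 <- function(x) if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)

var_to_aggr <- c("var1", "var2", "var3")
aggr_by <- "name"

# Same call shape as in the question, with f1 in place of sum.
summed <- aggregate(test[var_to_aggr], by = test[aggr_by], FUN = f1)
print(summed)
```

With this helper, var3 for name A should come out as NA, while var1 for name A is still 6 because only one of its four values is missing. Note that `all(is.na(x))` is also TRUE for a zero-length vector, so a truly empty group would return NA as well.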

Or

library(dplyr)
# summarise_each() and funs() are deprecated; across() (dplyr >= 1.0.0) replaces them
test %>% 
   group_by(name) %>% 
   summarise(across(everything(), f1))

Or

library(data.table)
# setDT() converts test to a data.table by reference
setDT(test)[, lapply(.SD, f1), name]

Answered by akrun on Sep 29 '22 at 14:09