Make sum of an empty set/set of NAs return NA instead of 0?

Tags: r, na, sum

The sum function returns 0 if it is applied to an empty set. Is there a simple way to make it return NA if it is applied to a set of NA values?

Here is a borrowed example:

test <- data.frame(name = rep(c("A", "B", "C"), each = 4),
               var1 = rep(c(1:3, NA), 3),
               var2 = 1:12,
               var3 = c(rep(NA, 4), 1:8))

test
    name var1 var2 var3
1     A    1    1   NA
2     A    2    2   NA
3     A    3    3   NA
4     A   NA    4   NA
5     B    1    5    1
6     B    2    6    2
7     B    3    7    3
8     B   NA    8    4
9     C    1    9    5
10    C    2   10    6
11    C    3   11    7
12    C   NA   12    8

I would like the sum of each of the three variables per name. Here is what I tried:

var_to_aggr <- c("var1","var2","var3")
aggr_by <- "name"
summed <- aggregate(test[var_to_aggr],by=test[aggr_by],FUN="sum", na.rm = TRUE)

This gives me:

  name var1 var2 var3
1    A    6   10    0
2    B    6   26   10
3    C    6   42   26

But I need:

  name var1 var2 var3
1    A    6   10   NA
2    B    6   26   10
3    C    6   42   26

The sum for name A, var3 should be NA and not 0. (Just to be clear, it should not be NA for name A, var1, where the set contains one NA but also valid values that should be summed up.) Any ideas?

I have been fiddling with na.action, but sum doesn't seem to accept it.

asked May 21 '15 10:05

Kastany

1 Answer

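You can wrap sum in a function that returns NA when every value in a group is missing:

# Return NA if all values are missing; otherwise sum the non-missing values
f1 <- function(x) if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)
# na.action = NULL keeps the rows that contain NA
aggregate(. ~ name, test, FUN = f1, na.action = NULL)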
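The `na.action` argument matters because the formula interface of `aggregate` defaults to `na.omit`, which removes every row containing an NA before any grouping happens. The following is a minimal sketch of that difference, assuming the `test` data frame from the question; it uses `na.pass` in place of `NULL` (an assumption made for illustration, both are meant to leave the rows in), and the names `dropped` and `kept` are hypothetical.

```r
# Data frame from the question
test <- data.frame(name = rep(c("A", "B", "C"), each = 4),
                   var1 = rep(c(1:3, NA), 3),
                   var2 = 1:12,
                   var3 = c(rep(NA, 4), 1:8))

f1 <- function(x) if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)

# Formula-interface default (na.action = na.omit): rows with an NA in any
# column are removed first, so group A (var3 entirely missing) disappears
dropped <- aggregate(. ~ name, test, FUN = f1)

# na.pass leaves those rows in, so f1 gets to see the NAs itself
kept <- aggregate(. ~ name, test, FUN = f1, na.action = na.pass)

dropped
kept
```

With the default, the sums for B and C are also computed from fewer rows, because the rows where only var1 is missing are dropped as well.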

Or

library(dplyr)
# across() replaces the deprecated summarise_each(funs(f1)) in dplyr >= 1.0.0
test %>%
   group_by(name) %>%
   summarise(across(everything(), f1))

Or

library(data.table)
setDT(test)[, lapply(.SD, f1), by = name]
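As a quick check of what `f1` does on plain vectors, and of how it could be dropped into the call style used in the question, here is a small sketch. It assumes the same `test` data frame and reuses the question's `var_to_aggr` and `aggr_by` variables; passing `f1` to the data-frame method of `aggregate` is an illustration and not part of the original answer.

```r
f1 <- function(x) if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)

f1(c(NA, NA))      # all values missing: NA
f1(c(1L, 2L, NA))  # some values present: 3
f1(integer(0))     # empty input: all(logical(0)) is TRUE, so NA as well

# Data frame from the question
test <- data.frame(name = rep(c("A", "B", "C"), each = 4),
                   var1 = rep(c(1:3, NA), 3),
                   var2 = 1:12,
                   var3 = c(rep(NA, 4), 1:8))

var_to_aggr <- c("var1", "var2", "var3")
aggr_by <- "name"

# Same call as in the question, with f1 instead of "sum" and no na.rm argument
summed <- aggregate(test[var_to_aggr], by = test[aggr_by], FUN = f1)
summed
```

Note that `f1` returns `NA_integer_`, which matches the integer columns here; for double columns `NA_real_` would be the matching type.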

answered Sep 29 '22 14:09

akrun
