
Calculate cumsum() while ignoring NA values

Tags:

r

Consider the following named vector x.

( x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8]) )
# a  b  c  d  e  f  g  h 
# 1  2  0 NA  4 NA NA  6 

I'd like to calculate the cumulative sum of x while ignoring the NA values. Many R functions have an argument na.rm which removes NA elements prior to calculations. cumsum() is not one of them, which makes this operation a bit tricky.
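A quick sketch of the difference, using the same x: sum() can drop the NA values through na.rm, while cumsum() propagates the first NA to every later position. The variable names total and running are only illustrative.

```r
x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])

# sum() accepts na.rm, so the NA elements are dropped before adding
total <- sum(x, na.rm = TRUE)
total
# [1] 13

# cumsum() has no such argument: once an NA is reached,
# every later element of the running total is NA as well
running <- cumsum(x)
running
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA NA NA NA NA 
```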

I can do it this way.

y <- setNames(numeric(length(x)), names(x))
z <- cumsum(na.omit(x))
y[names(y) %in% names(z)] <- z
y[!names(y) %in% names(z)] <- x[is.na(x)]
y
# a  b  c  d  e  f  g  h 
# 1  3  3 NA  7 NA NA 13 

But this seems excessive, and makes a lot of new assignments/copies. I'm sure there's a better way.

What better methods are there to return the cumulative sum while effectively ignoring NA values?

Rich Scriven asked Aug 29 '14 21:08


3 Answers

You can do this in one line with:

cumsum(ifelse(is.na(x), 0, x)) + x*0
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13

Or, similarly:

library(dplyr)
cumsum(coalesce(x, 0)) + x*0
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13 
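The `+ x*0` part is what puts the NA values back. A step-by-step sketch of the same one-liner in base R, with the intermediate names (filled, running, mask, result) chosen here purely for illustration:

```r
x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])

# Step 1: treat NA as 0 so the running total is never turned into NA
filled  <- ifelse(is.na(x), 0, x)
running <- cumsum(filled)

# Step 2: x * 0 is 0 where x is observed and NA where x is missing
mask <- x * 0

# Step 3: adding the mask restores NA at the originally missing positions
result <- running + mask
result
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13 
```

One caveat with this trick: Inf * 0 is NaN in R, so an infinite element in x would become NaN instead of keeping its running total.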
josliber answered Oct 27 '22 15:10


It's an old question, but tidyr offers another solution, based on the idea of replacing NA with zero. Note that the NA positions then carry the running total forward instead of staying NA.

library(tidyr)

cumsum(replace_na(x, 0))
# a  b  c  d  e  f  g  h 
# 1  3  3  3  7  7  7 13 
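If the NA positions should stay NA, as in the output the question asks for, the zero-filled total can be masked afterwards. A minimal sketch, assuming base R's replace() in place of tidyr's replace_na() so that it runs without extra packages; the names miss, filled_total and masked_total are illustrative:

```r
x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])

miss <- is.na(x)

# Same idea as replace_na(x, 0): fill the gaps with 0, then accumulate
filled_total <- cumsum(replace(x, miss, 0))

# Put NA back wherever the input was missing
masked_total <- replace(filled_total, miss, NA)
masked_total
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13 
```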
DJV answered Oct 27 '22 13:10


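Do you want something like this?

x2 <- x
# overwrite only the non-missing positions with their cumulative sum
x2[!is.na(x)] <- cumsum(x2[!is.na(x)])

x2
# a  b  c  d  e  f  g  h 
# 1  3  3 NA  7 NA NA 13 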
[edit] Alternatively, as suggested in a comment above, you can change the NAs to 0s:

miss <- is.na(x)
x0 <- x          # work on a copy so that x keeps its NA values
x0[miss] <- 0
cs <- cumsum(x0)
cs[miss] <- NA
# cs is the requested cumsum
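Either variant can be wrapped in a small helper so that the input is never modified. A sketch of the first variant; the function name cumsum_skip_na is hypothetical, not an existing base R or package function:

```r
# Hypothetical helper: cumulative sum over the non-NA elements only,
# leaving NA in place wherever the input was NA
cumsum_skip_na <- function(v) {
  keep <- !is.na(v)
  out <- v                      # copy, so names and NA positions are kept
  out[keep] <- cumsum(v[keep])  # accumulate the observed values only
  out
}

x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])
cumsum_skip_na(x)
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA  7 NA NA 13 
```

Because the helper works on a copy, x itself still contains its original NA values after the call.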
lebatsnok answered Oct 27 '22 14:10