R: Conditional summation of a numeric vector

Tags:

r

I have vectors that have numeric values. For example:

inVector <- c(2, -10, 5, 34, 7)

I need to transform this so that when I encounter a negative element, that negative element gets summed with subsequent elements until the element that turns the sum positive:

outVector <- c(2, 0, 0, 29, 7)

The negative elements are set to zero so that the overall sum is preserved. So elements 2 and 3 become zero and the fourth element equals 29 = -10 + 5 + 34. I tried a for loop solution like this:

outVector <- numeric(length = length(inVector))

for(i in 1:length(inVector)) {
   outVector <- inVector
   outVector[i] <- ifelse(outVector[i] < 0, 0, outVector[i])
   outVector[i + 1] <- ifelse(outVector[i] == 0, sum(inVector[i:(i+1)]), outVector[i + 1])
   outVector <- outVector[1:length(inVector)]
   }

but that didn't work. I would also be most interested in a solution that works in a dplyr pipe.


asked Aug 23 '16 13:08

2 Answers

If we want to optimize, we can use the more efficient Reduce function to iterate through the vector:

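# Helper function
zeroElement <- function(vec) {
  # Carry a negative running sum forward; restart from y once it is non-negative
  r <- Reduce(function(x, y) if (x >= 0) y else sum(x, y), vec, accumulate = TRUE)
  r[r < 0] <- 0  # zero the positions where the running sum was still negative
  return(r)
}

# Use function
zeroElement(inVector)
#[1]  2  0  0 29  7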
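To see why this works, it may help to look at the intermediate vector that Reduce produces before the negative entries are zeroed. The following is a minimal sketch using the inVector from the question; the name `step` is only illustrative and not part of the original answer.

```r
inVector <- c(2, -10, 5, 34, 7)

# Illustrative name: the two-argument step passed to Reduce.
# x is the running value so far, y is the next element.
step <- function(x, y) if (x >= 0) y else x + y

# accumulate = TRUE keeps every intermediate running value
running <- Reduce(step, inVector, accumulate = TRUE)
print(running)
# [1]   2 -10  -5  29   7

# Positions where the running value is still negative become zero
running[running < 0] <- 0
print(running)
# [1]  2  0  0 29  7
```

Since the function maps a vector to a vector of the same length, it should also be usable on a data frame column inside a dplyr pipe, for example `df %>% mutate(out = zeroElement(x))`, where `df` and `x` are hypothetical names.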

Speed test: zeroElement is about 25% faster than MakeNonNeg (defined in the other answer):

t3 <- MakeNonNeg(BigVec)
t4 <- zeroElement(BigVec)
all.equal(t3, t4)
#[1] TRUE
library(microbenchmark)
microbenchmark(
  makeNonNeg = MakeNonNeg(BigVec),
  zeroElement = zeroElement(BigVec),
  times=10)
# Unit: seconds
#        expr      min       lq     mean   median       uq      max neval cld
#  makeNonNeg 2.047484 2.099289 2.195988 2.111135 2.248381 2.531009    10   b
# zeroElement 1.529257 1.580789 1.666000 1.664855 1.725528 1.837825    10  a

Session info, for comparison:

sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

answered Oct 06 '22 20:10

Pierre L

Try this:

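MakeNonNeg <- function(v) {
    size <- length(v)
    myOut <- as.numeric(v)  # work on a double copy of the input
    if (size > 1L) {
        for (i in 1:(size-1L)) {
            if (myOut[i] >= 0) {next}  # non-negative: nothing to carry
            # negative: add the value to the next element and zero this one
            myOut[i+1L] <- myOut[i]+myOut[i+1L]
            myOut[i] <- 0
        }
    }
    myOut
}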

MakeNonNeg(inVector)
[1]  2  0  0 29  7
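One detail worth checking is what happens when the vector ends with a negative value. The loop above stops at the second-to-last element, so the last element is never zeroed, whereas the Reduce-based zeroElement from the other answer zeroes it. A small sketch of the difference, with both functions repeated so the block runs on its own:

```r
MakeNonNeg <- function(v) {
    size <- length(v)
    myOut <- as.numeric(v)
    if (size > 1L) {
        for (i in 1:(size - 1L)) {
            if (myOut[i] >= 0) next
            myOut[i + 1L] <- myOut[i] + myOut[i + 1L]
            myOut[i] <- 0
        }
    }
    myOut
}

zeroElement <- function(vec) {
    r <- Reduce(function(x, y) if (x >= 0) y else sum(x, y), vec, accumulate = TRUE)
    r[r < 0] <- 0
    r
}

v <- c(5, -3)      # trailing negative with nothing after it to absorb it
MakeNonNeg(v)      # 5 -3: last element untouched, total still 2
zeroElement(v)     # 5  0: last element zeroed, total becomes 5
```

Which behaviour is wanted depends on whether preserving the total or guaranteeing a non-negative result matters more; the question does not say.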

Below is a more exotic example:

set.seed(4242)

BigVec <- sample(-40000:100000, 100000, replace = TRUE)
gmp::sum.bigz(BigVec)
Big Integer ('bigz') :
    [1] 2997861106

t3 <- MakeNonNeg(BigVec)
gmp::sum.bigz(t3)
Big Integer ('bigz') :
    [1] 2997861106

BigVec[1:20]
[1]  98056   8680  -7814  53620  58390  90832  74970 -16392  52648  83779 -17229  38484 -36589  75156  71200  95968 -11599  57705
[19]  19209 -21596

t3[1:20]
[1] 98056  8680     0 45806 58390 90832 74970     0 36256 83779     0 21255     0 38567 71200 95968     0 46106 19209     0
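The gmp call above is presumably there because the total (2997861106) exceeds the 32-bit integer maximum (2147483647), so a plain sum() on the integer vector would overflow. As a sketch, the same sanity checks can be done in base R by converting to double first. The variable names `totalBefore` and `totalAfter` are illustrative, and the sampled values may differ between R versions even with the same seed.

```r
MakeNonNeg <- function(v) {
    size <- length(v)
    myOut <- as.numeric(v)
    if (size > 1L) {
        for (i in 1:(size - 1L)) {
            if (myOut[i] >= 0) next
            myOut[i + 1L] <- myOut[i] + myOut[i + 1L]
            myOut[i] <- 0
        }
    }
    myOut
}

set.seed(4242)
BigVec <- sample(-40000:100000, 100000, replace = TRUE)
t3 <- MakeNonNeg(BigVec)

# Convert to double before summing to avoid integer overflow
totalBefore <- sum(as.numeric(BigVec))
totalAfter  <- sum(t3)   # t3 is already double (as.numeric inside MakeNonNeg)

totalBefore == totalAfter   # the total is preserved
all(head(t3, -1) >= 0)      # every element except possibly the last is non-negative
```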

Here is my system info:

sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Below are timings for both functions with JIT disabled.

microbenchmark(
    makeNonNeg = MakeNonNeg(BigVec),
    zeroElement = zeroElement(BigVec),
    times=10)
Unit: milliseconds
       expr      min       lq     mean   median       uq      max neval
 makeNonNeg 254.1255 255.8430 267.9527 258.6369 277.0222 303.6516    10
zeroElement 152.0358 164.7988 175.3191 166.4948 198.3855 209.8739    10

With JIT enabled, the results for MakeNonNeg change dramatically. The results for zeroElement, however, barely change (I suspect that since Reduce does most of the work and is already byte-compiled, there is not much room for improvement).

library(compiler)
enableJIT(3)
[1] 0

microbenchmark(
    makeNonNeg = MakeNonNeg(BigVec),
    zeroElement = zeroElement(BigVec),
    times=10)
Unit: milliseconds
       expr       min        lq      mean    median        uq       max neval
 makeNonNeg  11.20514  11.55366  12.76953  11.84655  12.20554  20.60036    10
zeroElement 144.15123 149.33591 163.66421 157.34711 176.20139 198.57268    10

So, with JIT disabled, zeroElement is about 50% faster and when JIT is enabled, MakeNonNeg is about 13x faster.
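As a side note, the byte-code compiler can also be applied to a single function with compiler::cmpfun, without changing the global JIT level. The sketch below assumes a small test vector `v` and the name `MakeNonNegCmp`, both made up for illustration. On R 3.4.0 and later the JIT is enabled at level 3 by default, so the explicit step mostly matters on older versions such as the 3.3.0 used here.

```r
library(compiler)

MakeNonNeg <- function(v) {
    size <- length(v)
    myOut <- as.numeric(v)
    if (size > 1L) {
        for (i in 1:(size - 1L)) {
            if (myOut[i] >= 0) next
            myOut[i + 1L] <- myOut[i] + myOut[i + 1L]
            myOut[i] <- 0
        }
    }
    myOut
}

# Byte-compile this one function explicitly (hypothetical name)
MakeNonNegCmp <- cmpfun(MakeNonNeg)

set.seed(1)
v <- sample(-40:100, 1000, replace = TRUE)

# Compilation must not change the result, only the speed
sameResult <- identical(MakeNonNeg(v), MakeNonNegCmp(v))

# enableJIT() returns the previous level, which makes it easy to restore
oldLevel <- enableJIT(0)   # switch JIT off, e.g. for a "JIT disabled" timing
enableJIT(oldLevel)        # put the previous setting back
```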

answered Oct 06 '22 18:10

Joseph Wood

Antti

People also ask

2 Answers

Pierre L

Joseph Wood

Recent Activity

Donate For Us