Last observation added forward

Tags:

r

First question here: after some searching and scrolling, I'm still stuck.

I have a big vector that should always be increasing, but it sometimes resets to 0. Every time it resets to 0, I'd like the previous non-zero value to be added to the values that follow. I've tried LOCF, but it doesn't work: it only fills my 0 values with the previous value, and then the series drops back to the lowest value.

Vector example:

Data	Desired transformation
0	0
0	0
1	1
2	2
3	3
5	5
6	6
0	6
0	6
1	7
2	8

asked Apr 30 '21 10:04

fxpadi

4 Answers

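Perhaps you can try cumsum + rle, as below:

v <- df$Data
# rle() finds the runs of zeros; idx is the last position of each run minus 1.
# That equals the first position of the run only when the run has exactly
# two zeros, as in the example data.
idx <- with(
  rle(v == 0),
  cumsum(lengths)[values] - 1
)
# Put the last value before each reset at idx, then accumulate the offsets.
df$DataOut <- v + cumsum(replace(rep(0, length(v)), idx, v[pmax(1, idx - 1)]))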

which gives

> df
# A tibble: 11 x 2
    Data DataOut
   <dbl>   <dbl>
 1     0       0
 2     0       0
 3     1       1
 4     2       2
 5     3       3
 6     5       5
 7     6       6
 8     0       6
 9     0       6
10     1       7
11     2       8
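
The code above places the offset one position before the end of each zero run, which coincides with the start of the run for two-element runs like those in the example. If the zero runs in your real data can have other lengths, here is a minimal sketch of a variant that anchors the offset at the first zero of each run instead. The names ends, starts, reset_at and offset are my own, not part of the answer above, and I'm assuming every reset goes to exactly 0.

```r
v <- c(0, 0, 1, 2, 3, 5, 6, 0, 0, 1, 2)

r      <- rle(v == 0)
ends   <- cumsum(r$lengths)          # last index of every run
starts <- ends - r$lengths + 1       # first index of every run

# First index of each zero run, skipping a run that opens the vector
# (nothing precedes it, so there is nothing to carry forward)
reset_at <- starts[r$values & starts > 1]

offset <- numeric(length(v))
offset[reset_at] <- v[reset_at - 1]  # last value seen before each reset

out <- v + cumsum(offset)
out
```

For the example vector the only reset is at position 8, so 6 is added from there on.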

answered Oct 18 '22 21:10

ThomasIsCoding

I think this will also work. (I've kept the helper column d so it's easier to see what is actually happening.)

library(dplyr)

df %>% mutate(d = c(0, diff(Data)),
              out = cumsum(pmax(-1 * Data, d)))

    Data     d   out
   <dbl> <dbl> <dbl>
 1     0     0     0
 2     0     0     0
 3     1     1     1
 4     2     1     2
 5     3     1     3
 6     5     2     5
 7     6     1     6
 8     0    -6     6
 9     0     0     6
10     1     1     7
11     2     1     8

Once you understand it, you can simply do:

df %>% mutate(out = cumsum(pmax(-1 * Data, c(0, diff(Data)))))

# A tibble: 11 x 2
    Data   out
   <dbl> <dbl>
 1     0     0
 2     0     0
 3     1     1
 4     2     2
 5     3     3
 6     5     5
 7     6     6
 8     0     6
 9     0     6
10     1     7
11     2     8
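
To see why the pmax() call removes the drop at a reset, the same computation can be walked through in base R, one step at a time. This is only a sketch on the example vector; the names d and step are mine, and it assumes a reset always lands on exactly 0.

```r
v <- c(0, 0, 1, 2, 3, 5, 6, 0, 0, 1, 2)

d    <- c(0, diff(v))  # change from the previous element; -6 at the reset
step <- pmax(-v, d)    # at a reset v is 0, so pmax(0, -6) clamps the drop to 0
out  <- cumsum(step)   # summing the clamped steps rebuilds the running total

out
```

Everywhere else v is positive, so -v is negative and pmax() simply keeps the positive increase d.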

answered Oct 18 '22 19:10

AnilGoyal

I believe there are much better ways than a for loop to answer your question, but this one is quite stable and gives your desired output. I used to be a big fan of for loops, and whenever I need a solution that requires more flexibility I do not hesitate to use them. In your case, this was the first solution that came to mind.

out <- numeric(nrow(df))
out[1] <- df$Data[1]
for (i in seq_len(nrow(df))[-1]) {
  if (df$Data[i] == 0 && df$Data[i - 1] != 0) {
    # Reset detected: carry the previous total forward unchanged
    out[i] <- out[i - 1]
  } else {
    # Otherwise add the change since the previous row
    out[i] <- out[i - 1] + (df$Data[i] - df$Data[i - 1])
  }
}

cbind(df, out)

   Data out
1     0   0
2     0   0
3     1   1
4     2   2
5     3   3
6     5   5
7     6   6
8     0   6
9     0   6
10    1   7
11    2   8

Data

library(tibble)

df <- tibble(
  Data = c(0, 0, 1, 2, 3, 5, 6, 0, 0, 1, 2)
)

answered Oct 18 '22 19:10

Anoushiravan R

Using Tidyverse

Setup:

library(tidyverse)

(df <- tibble::tibble( Data = c(0, 0, 2, 4, 0, 0, 1, 2, 0, 1, 3)))

Actual code

df %>%
  mutate(to_add = Data - lag(Data),
         # the first row has no predecessor (NA) and a drop is a reset: add 0 for both
         to_add = ifelse(is.na(to_add) | to_add < 0, 0, to_add),
         out = cumsum(to_add)) %>%
  select(!to_add)

Output

# A tibble: 11 x 2
    Data   out
   <dbl> <dbl>
 1     0     0
 2     0     0
 3     2     2
 4     4     4
 5     0     4
 6     0     4
 7     1     5
 8     2     6
 9     0     6
10     1     7
11     3     9

The trick is to use the lag function, which returns the value from the previous row.
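
If you would rather not load the tidyverse, the same idea can be sketched in base R. Here c(NA, head(v, -1)) is my stand-in for lag(v), and prev is a name I made up for illustration; this is a rough equivalent of the pipeline above, not a drop-in replacement.

```r
v <- c(0, 0, 2, 4, 0, 0, 1, 2, 0, 1, 3)

prev   <- c(NA, head(v, -1))             # value from the previous position
to_add <- v - prev                       # change since the previous position
to_add[is.na(to_add) | to_add < 0] <- 0  # first element and resets add nothing
out    <- cumsum(to_add)

out
```

Note that, like the pipeline above, this starts the total at 0 even if the first value is not 0.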

Base R (works only if values are consecutive)

df <- data.frame( Data = c(0, 0, 1, 2, 0, 0, 1, 2, 0, 1, 2))

df$out <- cumsum(df$Data != 0)

output

   Data out
1     0   0
2     0   0
3     1   1
4     2   2
5     0   2
6     0   2
7     1   3
8     2   4
9     0   4
10    1   5
11    2   6

The trick is to flag the non-zero rows and then take the cumulative sum of those flags (see cumsum).

df$Data != 0 returns TRUE for every row where 1 should be added, and cumsum converts each TRUE to the number 1.
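
To make the "consecutive values only" caveat concrete, here is a small check on the vector from the question, which jumps from 3 to 5. The names counted and wanted are just for illustration.

```r
v <- c(0, 0, 1, 2, 3, 5, 6, 0, 0, 1, 2)   # note the jump from 3 to 5

counted <- cumsum(v != 0)                  # adds 1 per non-zero row
wanted  <- c(0, 0, 1, 2, 3, 5, 6, 6, 6, 7, 8)

which(counted != wanted)                   # 6 7 8 9 10 11
```

From the jump onwards the count is one short, so this shortcut is only safe when every non-zero step is exactly +1.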

answered Oct 18 '22 19:10

pietrodito
