How to segregate a vector/variable in two states depending upon percent increase of decrease from its previous value

Question

Say I have a vector A (or a variable A in a dataframe say df) with following values

A <- c(90L, 100L, 5L, 15L, 16L, 2L, 20L, 25L, 2L, 40L, 16L, 16L, 32L, 51L, 52L)

A
#>  [1]  90 100   5  15  16   2  20  25   2  40  16  16  32  51  52

df <- data.frame(A = A)

^{Created on 2021-05-11 by the reprex package (v2.0.0)}

Now I want to divide these values into two States say 0 and 1 based on the following criteria

By default first value will be 0 state.
If the value has dropped by more than 80% of its previous value (e.g. in third row from 100 to 5 i.e. 95% drop), its state changes to 1 from 0.
Here the difficult part arrives. All values following this dropped value will remain in 1 state until it rises 50% of that value (100) i.e. 50 again. It rises above 50 in 14th row. So value in 14th row will be state 0 again
Since last value is not a decrease by more than 80% of previous value i.e. 51 it will be same state i.e. 0.

So basically I am trying to get an output like this

     A State
1   90     0
2  100     0
3    5     1
4   15     1
5   16     1
6    2     1
7   20     1
8   25     1
9    2     1
10  40     1
11  16     1
12  16     1
13  32     1
14  51     0
15  52     0

BaseR or tidyverse approach will be doing fine for me.

Where I am stuck is actually retrieving the threshold value (100 or 50%) till 14th row as you can see it further drops by 80% again two times, once in row 6th and again row 9th.

One more test case can be

     A State
1   90     0
2  100     0
3    5     1
4   15     1
5   16     1
6    2     1
7   20     1
8   25     1
9    2     1
10  40     1
11  16     1
12  16     1
13  32     1
14  51     0
15  52     0
16  60     0
17  10     1
18  20     1
19   5     1
20  30     1
21  31     0
22  50     0
23 100     0

Explanation

enter image description here

benchmarking both answers on df <- data.frame(A = sample(1:100, 100000, T))

Unit: microseconds
       expr      min        lq       mean    median       uq      max neval
 BlueVoxe() 214547.5 271455.60 298551.084 300763.90 309823.3 499692.5   100
      Ben()      4.2      4.85      7.115      5.35      9.8     11.3   100

Ben · Accepted Answer

Anil, perhaps this might be something to help in moving forward. It isn't elegant. You can create a loop checking for changes in values, while tracking prior thresholds.

A <- c(90, 100, 5, 15, 16, 2, 20, 25, 2, 40, 16, 16, 32, 
       51, 52, 60, 10, 20, 5, 30, 31, 50, 100)

threshold <- NA
State <- 0

for (i in 2:length(A)) {
  if (State[i-1] == 0) {
    if ((A[i-1] - A[i]) > (.8 * A[i-1])) {
      threshold <- A[i-1]
      State <- c(State, 1)
    } else {
      State <- c(State, 0)
    }
  } else {
    if (A[i] > (.5 * threshold)) { 
      State <- c(State, 0)
    } else {
      State <- c(State, 1)
    }
  }
}

data.frame(A, State)

Output

     A State
1   90     0
2  100     0
3    5     1
4   15     1
5   16     1
6    2     1
7   20     1
8   25     1
9    2     1
10  40     1
11  16     1
12  16     1
13  32     1
14  51     0
15  52     0
16  60     0
17  10     1
18  20     1
19   5     1
20  30     1
21  31     0
22  50     0
23 100     0

Data

A <- c(90, 100, 5, 15, 16, 2, 20, 25, 2, 40, 16, 16, 32, 51, 52, 60, 
10, 20, 5, 30, 31, 50, 100)

wurli · Answer

Here is a function using purrr::accumulate() that should do the trick:

library(dplyr)
library(purrr)

A <- c(90L, 100L, 5L, 15L, 16L, 2L, 20L, 25L, 2L, 40L, 16L, 16L, 32L, 51L, 52L)
df <- data.frame(A = A)

trans <- function(x, trigger_0 = 0.5, trigger_1 = -0.8) {
  
  # This variable 'remembers' the last value that switched 
  # the indicator to 1
  compare <- 0
  
  out <- purrr::accumulate(seq_along(x), .init = 0, function(prev_result, i) {
    
    # Initial value should be 0 - this will return .init
    if (i == 1) {
      return(0)
    }
    
    # If previous result is 0 we only need to check that change from
    # prev value is less than `trigger_1`
    if (prev_result == 0) {
      
      # Compute the change from previous value
      change1 <- (x[i] - x[i - 1]) / x[i - 1]
      
      if (change1 < trigger_1) {
        
        # Reset 'compare' to be used in next iterations
        compare <<- x[i - 1]
        return(1)
        
      } else return(0)
      
    }
    
    if (prev_result == 1) {
      
      # Compute change from 'compare'
      change2 <- x[i] / compare
      
      # Return 1 or 0 based on the increase/decrease from 'compare'
      if (change2 > trigger_0) {
        return(0)
      } else return(1)
    }
    
  })
  
  # Remove the leading 0 created by using `init`
  tail(out, -1)
   
}

df %>% 
  mutate(Indicator = trans(A))
#>      A Indicator
#> 1   90         0
#> 2  100         0
#> 3    5         1
#> 4   15         1
#> 5   16         1
#> 6    2         1
#> 7   20         1
#> 8   25         1
#> 9    2         1
#> 10  40         1
#> 11  16         1
#> 12  16         1
#> 13  32         1
#> 14  51         0
#> 15  52         0

How to segregate a vector/variable in two states depending upon percent increase of decrease from its previous value

Tags:

r

purrr

rolling-computation

tidyverse

AnilGoyal

2 Answers

Ben

wurli

Recent Activity

Donate For Us

How to segregate a vector/variable in two states depending upon percent increase of decrease from its previous value

Tags:

r

purrr

rolling-computation

tidyverse

AnilGoyal

2 Answers

Ben

wurli

Related questions

Recent Activity

Donate For Us