
calculate indices with base year and relative percentage change

Tags: r, dplyr, percentage, tibble

I am looking for a way to create, within each combination of `id` and `grp`, an index that starts at 100, using the lag (or is it the lead?) of `value` together with the new index number `idx_value` to calculate the next index number.

# install.packages(c("tidyverse"), dependencies = TRUE)
library(tibble)
library(magrittr)
library(dplyr)  # provides group_by(), mutate() and lag() used below

I have this data frame:

start_tbl <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L), grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L), year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L, 
7L, 8L, 9L, 7L, 8L, 9L), value = c(2, -7, -2.3, 1.1, -1, -12, 
-4, 2, 1, -3, 2, -1, -4, -2)), row.names = c(NA, -14L), class = c("tbl_df", 
"tbl", "data.frame"))
start_tbl
# A tibble: 14 x 4
      id   grp  year value
   <int> <int> <int> <dbl>
 1     1     1     7   2  
 2     1     1     8  -7  
 3     1     1     9  -2.3
 4     1     1    10   1.1
 5     1     2     7  -1  
 6     1     2     8 -12  
 7     1     2     9  -4  
 8     1     2    10   2  
 9     2     1     7   1  
10     2     1     8  -3  
11     2     1     9   2  
12     2     2     7  -1  
13     2     2     8  -4  
14     2     2     9  -2

Now I want to take id 1, grp 1 and build the index. Year 7 is the base year, so its index is 100. Year 8 is then calculated as 100*(1 + -7/100) = 93.0. Next, that result, 93, is used to calculate the following year: 93*(1 + -2.3/100) = 90.861, and so forth. The index restarts at 100 for every new combination of id and grp, always with year 7 as the base year.
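The recursion described above can be written as a plain loop. This is only a sketch to pin down the arithmetic: `index_from_pct` is a hypothetical helper name, and it assumes the percentage changes are already sorted by year, with the first element belonging to the base year.

```r
# Hypothetical helper: chain percentage changes into an index that starts at `base`.
index_from_pct <- function(pct, base = 100) {
  if (length(pct) == 0) return(numeric(0))
  idx <- numeric(length(pct))
  idx[1] <- base  # base year: its own percentage change is not applied
  for (i in seq_along(pct)[-1]) {
    # each index value is the previous index value scaled by the current change
    idx[i] <- idx[i - 1] * (1 + pct[i] / 100)
  }
  idx
}

# id 1, grp 1 from start_tbl (years 7 to 10)
index_from_pct(c(2, -7, -2.3, 1.1))
```

Each step needs the previous index value, so this is a running product, not a row-wise calculation on `value` alone.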

I am quite close with:

start_tbl %>%
  group_by(id) %>%
  mutate(idx_value = value - lag(value),       # overwritten by the next line
         idx_value = 100 * (1 + value / 100))
# A tibble: 14 x 5
# Groups:   id [2]
      id   grp  year value idx_value
   <int> <int> <int> <dbl>     <dbl>
 1     1     1     7   2       102  
 2     1     1     8  -7        93  
 3     1     1     9  -2.3      97.7
 4     1     1    10   1.1     101. 
 5     1     2     7  -1        99  
 6     1     2     8 -12        88  
 7     1     2     9  -4        96  
 8     1     2    10   2       102  
 9     2     1     7   1       101  
10     2     1     8  -3        97  
11     2     1     9   2       102  
12     2     2     7  -1        99  
13     2     2     8  -4        96  
14     2     2     9  -2        98

but what I am trying to get to is:

end_tbl <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L), grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L), year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L, 
7L, 8L, 9L, 7L, 8L, 9L), value = c(2, -7, -2.3, 1.1, -1, -12, 
-4, 2, 1, -3, 2, -1, -4, -2), idx_value = c(100L, 93L, 91L, 92L, 
100L, 88L, 84L, 86L, 100L, 97L, 99L, 100L, 96L, 94L)), row.names = c(NA, 
-14L), class = c("tbl_df", "tbl", "data.frame"))
end_tbl
# A tibble: 14 x 5
      id   grp  year value idx_value
   <int> <int> <int> <dbl>     <int>
 1     1     1     7   2         100
 2     1     1     8  -7          93
 3     1     1     9  -2.3        91
 4     1     1    10   1.1        92
 5     1     2     7  -1         100
 6     1     2     8 -12          88
 7     1     2     9  -4          84
 8     1     2    10   2          86
 9     2     1     7   1         100
10     2     1     8  -3          97
11     2     1     9   2          99
12     2     2     7  -1         100
13     2     2     8  -4          96
14     2     2     9  -2          94

Any help would be appreciated. Maybe the answer is here.

A small additional example data set, `start_tbl2`, to illustrate the issue. If I use a starting tibble like `start_tbl2` below with a `cumsum()`-based approach, I get:

start_tbl2 <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
    grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
    year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L),
    value = c(2, -12, -18.3, 100, 15, 30, 40, -50)),
    row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))

library(dplyr)
start_tbl2 %>%
   group_by(id, grp) %>% 
   mutate(idx_value = c(100, round(100 * (1 + cumsum(value[-1])/100))))
# A tibble: 8 x 5
# Groups:   id, grp [2]
     id   grp  year value idx_value
  <int> <int> <int> <dbl>     <dbl>
1     1     1     7   2         100
2     1     1     8 -12          88
3     1     1     9 -18.3        70
4     1     1    10 100         170
5     1     2     7  15         100
6     1     2     8  30         130
7     1     2     9  40         170
8     1     2    10 -50         120

Whereas I get this when I calculate it by hand:

Percentage_change   cal_by_hand cumsum  diff
2                   100         100     0
-12                 88          88      0
-18.3               71.896      70      1.896
100                 143.792     170     -26.208
15                  100         100     0
30                  130         130     0
40                  182         170     12
-50                 91          120     -29
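The gap in the `diff` column comes from adding percentages instead of compounding them. A minimal base R sketch of the two calculations for id 1, grp 1 of `start_tbl2` (the names `additive` and `compounding` are only illustrative):

```r
pct <- c(2, -12, -18.3, 100)  # id 1, grp 1 of start_tbl2; the first value is the base year

# cumsum() adds the percentage points and applies the total to the base of 100
additive <- c(100, 100 * (1 + cumsum(pct[-1]) / 100))

# cumprod() chains the growth factors, so each change applies to the previous index
compounding <- 100 * cumprod(c(1, 1 + pct[-1] / 100))

data.frame(pct, additive, compounding, diff = compounding - additive)
```

Rounded, `additive` gives the 100, 88, 70, 170 of the `cumsum` column, while `compounding` gives the 100, 88, 71.896, 143.792 calculated by hand. The two agree only for the first step after the base year.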

423

asked May 14 '20 19:05

Eric Fail

2 Answers

Another way would be to use cumprod() after converting the percentage changes to growth factors:

library(dplyr)

start_tbl %>%
  group_by(id, grp) %>%
  mutate(idx_value = cumprod(c(100, (100 + value[-1]) / 100))) 

# A tibble: 14 x 5
# Groups:   id, grp [4]
      id   grp  year value idx_value
   <int> <int> <int> <dbl>     <dbl>
 1     1     1     7   2       100  
 2     1     1     8  -7        93  
 3     1     1     9  -2.3      90.9
 4     1     1    10   1.1      91.9
 5     1     2     7  -1       100  
 6     1     2     8 -12        88  
 7     1     2     9  -4        84.5
 8     1     2    10   2        86.2
 9     2     1     7   1       100  
10     2     1     8  -3        97  
11     2     1     9   2        98.9
12     2     2     7  -1       100  
13     2     2     8  -4        96  
14     2     2     9  -2        94.1
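The `end_tbl` in the question holds whole numbers, whereas the result above keeps the decimals. If the integer column is what is needed, one option is to round once after compounding. This is a sketch under two assumptions of mine: rounding happens only at the end (not at every step), and an `arrange()` step is added in case the rows are not already in year order. `rounded_tbl` is just an illustrative name.

```r
library(dplyr)

start_tbl <- data.frame(
  id    = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
  grp   = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L),
  year  = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L, 7L, 8L, 9L, 7L, 8L, 9L),
  value = c(2, -7, -2.3, 1.1, -1, -12, -4, 2, 1, -3, 2, -1, -4, -2)
)

rounded_tbl <- start_tbl %>%
  group_by(id, grp) %>%
  arrange(year, .by_group = TRUE) %>%  # assumption: make sure the base year comes first
  mutate(idx_value = as.integer(round(cumprod(c(100, 1 + value[-1] / 100))))) %>%
  ungroup()

rounded_tbl
```

For this data the rounded column matches the `idx_value` column of `end_tbl`.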

128

answered Sep 19 '22 21:09

Ritchie Sacramento

Based on the new dataset, we can use accumulate() from purrr:

library(purrr)
library(dplyr)
start_tbl2 %>%
      group_by(id, grp) %>%
      mutate(idx_value = accumulate(value[-1], ~ .x * (1 + .y/100), .init = 100 ))
# A tibble: 8 x 5
# Groups:   id, grp [2]
#     id   grp  year value idx_value
#  <int> <int> <int> <dbl>     <dbl>
#1     1     1     7   2       100  
#2     1     1     8 -12        88  
#3     1     1     9 -18.3      71.9
#4     1     1    10 100       144. 
#5     1     2     7  15       100  
#6     1     2     8  30       130  
#7     1     2     9  40       182  
#8     1     2    10 -50        91  

and using `start_tbl`:

start_tbl %>%
     group_by(id, grp) %>%
     mutate(idx_value = accumulate(value[-1], ~ .x * (1 + .y/100), .init = 100 ))
# A tibble: 14 x 5
# Groups:   id, grp [4]
#      id   grp  year value idx_value
#   <int> <int> <int> <dbl>     <dbl>
# 1     1     1     7   2       100  
# 2     1     1     8  -7        93  
# 3     1     1     9  -2.3      90.9
# 4     1     1    10   1.1      91.9
# 5     1     2     7  -1       100  
# 6     1     2     8 -12        88  
# 7     1     2     9  -4        84.5
# 8     1     2    10   2        86.2
# 9     2     1     7   1       100  
#10     2     1     8  -3        97  
#11     2     1     9   2        98.9
#12     2     2     7  -1       100  
#13     2     2     8  -4        96  
#14     2     2     9  -2        94.1
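If adding purrr and dplyr is not an option, the same fold can be sketched in base R with `Reduce(..., accumulate = TRUE)` for the running calculation and `ave()` for the grouping. `chain_index` is a hypothetical helper name, and the sketch assumes the rows of each group are already in year order.

```r
start_tbl2 <- data.frame(
  id    = rep(1L, 8),
  grp   = rep(1:2, each = 4),
  year  = rep(7:10, times = 2),
  value = c(2, -12, -18.3, 100, 15, 30, 40, -50)
)

# Hypothetical helper: fold the percentage changes, starting from `base`
chain_index <- function(pct, base = 100) {
  Reduce(function(prev, p) prev * (1 + p / 100), pct[-1],
         init = base, accumulate = TRUE)
}

# ave() applies chain_index within each id/grp combination and keeps the row order
start_tbl2$idx_value <- ave(start_tbl2$value, start_tbl2$id, start_tbl2$grp,
                            FUN = chain_index)
start_tbl2
```

The idea is the same as with `accumulate()`: the first value of each group is dropped and replaced by the starting index of 100.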

answered Sep 20 '22 21:09

akrun
