I am looking for a way to, within id and groups, create an index on 100 using the lag (or is it lead) of value
and the new index number idx_value
to calculate the next index number.
# install.packages(c("tidyverse"), dependencies = TRUE)
library(tibble)
library(magrittr)
Like, I have this dataframe:
start_tbl <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L), grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 2L, 2L), year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L,
7L, 8L, 9L, 7L, 8L, 9L), value = c(2, -7, -2.3, 1.1, -1, -12,
-4, 2, 1, -3, 2, -1, -4, -2)), row.names = c(NA, -14L), class = c("tbl_df",
"tbl", "data.frame"))
start_tbl
# A tibble: 14 x 4
id grp year value
<int> <int> <int> <dbl>
1 1 1 7 2
2 1 1 8 -7
3 1 1 9 -2.3
4 1 1 10 1.1
5 1 2 7 -1
6 1 2 8 -12
7 1 2 9 -4
8 1 2 10 2
9 2 1 7 1
10 2 1 8 -3
11 2 1 9 2
12 2 2 7 -1
13 2 2 8 -4
14 2 2 9 -2
now I want to take id 1 grp 1 and make the index, then calculate id 1 grp 1 year 7 as 100*(1+-7/100) = 93.0, next use that result, 93, to calculate the next year: 93*(1+-2.3/100)= 90.861, and so forth. Restarting on all index years, which is a new id and a new grp and base year 7.
I am quite close with:
tbl %>% group_by(id) %>% mutate(idx_value = value-lag(value), idx_value = 100*(1+value/100) )
# A tibble: 14 x 5
# Groups: id [2]
id grp year value idx_value
<int> <int> <int> <dbl> <dbl>
1 1 1 7 2 102
2 1 1 8 -7 93
3 1 1 9 -2.3 97.7
4 1 1 10 1.1 101.
5 1 2 7 -1 99
6 1 2 8 -12 88
7 1 2 9 -4 96
8 1 2 10 2 102
9 2 1 7 1 101
10 2 1 8 -3 97
11 2 1 9 2 102
12 2 2 7 -1 99
13 2 2 8 -4 96
14 2 2 9 -2 98
but what I am trying to get to is:
end_tbl <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L), grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 2L, 2L), year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L,
7L, 8L, 9L, 7L, 8L, 9L), value = c(2, -7, -2.3, 1.1, -1, -12,
-4, 2, 1, -3, 2, -1, -4, -2), idx_value = c(100L, 93L, 91L, 92L,
100L, 88L, 84L, 86L, 100L, 97L, 99L, 100L, 96L, 94L)), row.names = c(NA,
-14L), class = c("tbl_df", "tbl", "data.frame"))
end_tbl
# A tibble: 14 x 5
id grp year value idx_value
<int> <int> <int> <dbl> <int>
1 1 1 7 2 100
2 1 1 8 -7 93
3 1 1 9 -2.3 91
4 1 1 10 1.1 92
5 1 2 7 -1 100
6 1 2 8 -12 88
7 1 2 9 -4 84
8 1 2 10 2 86
9 2 1 7 1 100
10 2 1 8 -3 97
11 2 1 9 2 99
12 2 2 7 -1 100
13 2 2 8 -4 96
14 2 2 9 -2 94
Any help with be appreciated. Maybe the answer is here.
start_tbl2
to illustrate the issue. If I use a starting tibble like start_tbl2
below start_tbl2 <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L),
value = c(2, -12, -18.3, 100, 15, 30, 40, -50)),
row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))
library(dplyr)
start_tbl2 %>%
group_by(id, grp) %>%
mutate(idx_value = c(100, round(100 * (1 + cumsum(value[-1])/100))))
# A tibble: 8 x 5
# Groups: id, grp [2]
id grp year value idx_value
<int> <int> <int> <dbl> <dbl>
1 1 1 7 2 100
2 1 1 8 -12 88
3 1 1 9 -18.3 70
4 1 1 10 100 170
5 1 2 7 15 100
6 1 2 8 30 130
7 1 2 9 40 170
8 1 2 10 -50 120
Whereas I get this when I calculate it by hand:
Percentage_change cal_by_hand cumsum diff
2 100 100 0
-12 88 88 0
-18.3 71.896 70 1.896
100 143.792 170 -26.208
15 100 100 0
30 130 130 0
40 182 170 12
-50 91 120 -29
To calculate the Price Index, take the price of the Market Basket of the year of interest and divide by the price of the Market Basket of the base year, then multiply by 100.
How do I calculate percentage change? If you are tracking a particular stock's price increase, use the formula (New Price - Old Price)/Old Price and then multiply that number by 100. If the price decreased, use the formula (Old Price - New Price)/Old Price and multiply that number by 100.
The index is calculated by adding the stock prices of the 30 companies and then dividing by the divisor. The divisor changes when there are stock splits or dividends or when a company is added or removed from the index.
Another way would be to use cumprod()
after converting the values to percentages:
library(dplyr)
start_tbl %>%
group_by(id, grp) %>%
mutate(idx_value = cumprod(c(100, (100 + value[-1]) / 100)))
# A tibble: 14 x 5
# Groups: id, grp [4]
id grp year value idx_value
<int> <int> <int> <dbl> <dbl>
1 1 1 7 2 100
2 1 1 8 -7 93
3 1 1 9 -2.3 90.9
4 1 1 10 1.1 91.9
5 1 2 7 -1 100
6 1 2 8 -12 88
7 1 2 9 -4 84.5
8 1 2 10 2 86.2
9 2 1 7 1 100
10 2 1 8 -3 97
11 2 1 9 2 98.9
12 2 2 7 -1 100
13 2 2 8 -4 96
14 2 2 9 -2 94.1
Based on the new dataset
library(purrr)
library(dplyr)
start_tbl2 %>%
group_by(id, grp) %>%
mutate(idx_vlue = accumulate(value[-1], ~ .x * (1 + .y/100), .init = 100 ))
# A tibble: 8 x 5
# Groups: id, grp [2]
# id grp year value idx_vlue
# <int> <int> <int> <dbl> <dbl>
#1 1 1 7 2 100
#2 1 1 8 -12 88
#3 1 1 9 -18.3 71.9
#4 1 1 10 100 144.
#5 1 2 7 15 100
#6 1 2 8 30 130
#7 1 2 9 40 182
#8 1 2 10 -50 91
and using 'start_tbl
start_tbl %>%
group_by(id, grp) %>%
mutate(idx_vlue = accumulate(value[-1], ~ .x * (1 + .y/100), .init = 100 ))
# A tibble: 14 x 5
# Groups: id, grp [4]
# id grp year value idx_vlue
# <int> <int> <int> <dbl> <dbl>
# 1 1 1 7 2 100
# 2 1 1 8 -7 93
# 3 1 1 9 -2.3 90.9
# 4 1 1 10 1.1 91.9
# 5 1 2 7 -1 100
# 6 1 2 8 -12 88
# 7 1 2 9 -4 84.5
# 8 1 2 10 2 86.2
# 9 2 1 7 1 100
#10 2 1 8 -3 97
#11 2 1 9 2 98.9
#12 2 2 7 -1 100
#13 2 2 8 -4 96
#14 2 2 9 -2 94.1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With