I am trying to add another column to a dataframe where the new column is a function of the previous value in the new column and a current row value. I have tried to strip out irrelevant code and stick in easy numbers so that I might understand answers here. Given the following dataframe:
x
1 1
2 2
3 3
4 4
5 5
The next column (y) will add 5 to x and also add the previous row's value for y. There's no previous value for y in the first row, so I define it as 0. So the first row value for y would be x+5+0 or 1+5+0 or 6. The second row would be x+5+y(from 1st row) or 2+5+6 or 13. The dataframe should look like this:
x y
1 1 6
2 2 13
3 3 21
4 4 30
5 5 40
I tried this with case_when() and lag() functions like this:
test_df <- data.frame(x = 1:5)
test_df %>% mutate(y = case_when(x == 1 ~ 6,
                                 x > 1 ~ x + 5 + lag(y)))
Error: Problem with `mutate()` column `y`.
ℹ `y = case_when(x == 1 ~ 6, x > 1 ~ x + 5 + lag(y))`.
x object 'y' not found
Run `rlang::last_error()` to see where the error occurred.
I had thought y was defined when the first row was calculated. Is there a better way to do this? Thanks!
You don't need lag() here at all; a cumsum() suffices. Each y is the previous y plus x + 5, starting from 0, so y is simply the running total of x + 5.
test_df %>% mutate(y = cumsum(x + 5))
#> x y
#> 1 1 6
#> 2 2 13
#> 3 3 21
#> 4 4 30
#> 5 5 40
Data
test_df <- data.frame(x = 1:5)
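As a quick sanity check of that equivalence, here is a minimal base R sketch that computes the recurrence row by row and compares it with cumsum(). The names x, y_loop and prev are only illustrative and not part of the original answer.

```r
# Recurrence from the question: y[i] = x[i] + 5 + y[i - 1], with y[0] = 0
x <- 1:5

y_loop <- numeric(length(x))
prev <- 0                        # "previous y" for the first row
for (i in seq_along(x)) {
  y_loop[i] <- x[i] + 5 + prev
  prev <- y_loop[i]              # carry the new value into the next row
}

y_loop
# 6 13 21 30 40

# The running total of x + 5 gives the same values
all(y_loop == cumsum(x + 5))
# TRUE
```

The loop is only there to make the recurrence explicit; for a plain running sum like this one, cumsum() is shorter and vectorised.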
We can also use purrr::accumulate() here:
library(dplyr)
library(purrr)
test_df %>% mutate(y = accumulate(x + 5, ~ .x + .y))
x y
1 1 6
2 2 13
3 3 21
4 4 30
5 5 40
We can also use accumulate() with a regular R function instead of the formula shorthand:
test_df %>% mutate(y = accumulate(x + 5, function(x, y) {x + y}))
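If you would rather not load purrr, base R's Reduce() with accumulate = TRUE should behave the same way for this case. This is a sketch rather than part of the original answer, and the argument names acc and cur are just illustrative.

```r
test_df <- data.frame(x = 1:5)

# Reduce() folds the function over the vector; accumulate = TRUE keeps
# every intermediate result instead of only the final one.
test_df$y <- Reduce(function(acc, cur) acc + cur,
                    test_df$x + 5,
                    accumulate = TRUE)

test_df$y
# 6 13 21 30 40
```

As with accumulate(), this form is mostly worth reaching for when the step function is something other than a plain sum, since cumsum() already covers the sum itself.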