I am trying to create a new variable which is a function of previous rows and columns. I have found the lag() function in dplyr but it can't accomplish exactly what I would like.
library(dplyr)
x = data.frame(replicate(2, sample(1:3, 10, replace = TRUE)))
X1 X2
1 1 3
2 2 3
3 2 2
4 1 3
5 2 3
6 2 1
7 3 2
8 1 1
9 1 3
10 2 2
x = mutate(x, new_col = # if X2 == 1, then the value of X1 in the previous row,
                        # if X2 != 1, then 0
           )
My best attempt:
foo = function(x){
  if (x==1){
    return(lag(X1))
  }else{
    return(0)
  }
}
x = mutate(x, new_col=foo(X1))
We can use ifelse():
x %>%
  mutate(newcol = ifelse(X2 == 1, lag(X1), 0))
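As a quick check, here is a minimal sketch that applies the same ifelse() call to the ten rows printed in the question. The data frame is typed in by hand (an assumption, since the original was drawn with sample() and no seed), and `res` is just an illustrative name. The key point is that ifelse() is vectorized and works on the whole column at once, whereas a plain `if` expects a single TRUE/FALSE, which is likely why the foo() attempt does not work inside mutate().

```r
library(dplyr)

# Data copied by hand from the printed output in the question (assumed values)
x <- data.frame(
  X1 = c(1, 2, 2, 1, 2, 2, 3, 1, 1, 2),
  X2 = c(3, 3, 2, 3, 3, 1, 2, 1, 3, 2)
)

# ifelse() evaluates the condition for every row;
# lag(X1) is X1 shifted down by one row (first element becomes NA)
res <- x %>%
  mutate(newcol = ifelse(X2 == 1, lag(X1), 0))

res$newcol
# 0 0 0 0 0 2 0 3 0 0
```

Rows 6 and 8 are the only ones with X2 == 1, so they pick up the X1 values from rows 5 and 7.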
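In base R, you can use
x$newcol <- (x$X2 == 1) * c(NA, head(x$X1, -1))
Here c(NA, head(x$X1, -1)) is X1 shifted down by one row, and (x$X2 == 1) is a logical vector that R treats as 1/0 in arithmetic. The product of the two terms therefore returns the lagged value of X1 where X2 == 1 and 0 elsewhere. Note that the first row is always NA, because it has no previous row and NA times 0 is still NA.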
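A small base R sketch, using the rows printed in the question (typed in by hand, so an assumption) and a hypothetical helper named `lag1`, shows how a one-row lag can be built with head() and how the multiplication form differs from an ifelse() form in the first row:

```r
# Data copied by hand from the printed output in the question (assumed values)
x <- data.frame(
  X1 = c(1, 2, 2, 1, 2, 2, 3, 1, 1, 2),
  X2 = c(3, 3, 2, 3, 3, 1, 2, 1, 3, 2)
)

# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(v) c(NA, head(v, -1))

# Multiplication form: FALSE * NA is NA, so row 1 is NA even though X2[1] != 1
mult <- (x$X2 == 1) * lag1(x$X1)
mult
# NA 0 0 0 0 2 0 3 0 0

# ifelse() form: rows where X2 != 1 get 0 regardless of the lagged value
safe <- ifelse(x$X2 == 1, lag1(x$X1), 0)
safe
# 0 0 0 0 0 2 0 3 0 0
```

Which form is preferable depends on whether an NA or a 0 is wanted in the first row when X2 is not 1 there.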