I have the following data table:
dt <- fread("
ID | EO_1 | EO_2 | EO_3 | GROUP
ID_001 | 0.5 | 1.2 | | A
ID_002 | | | | A
ID_003 | | | | A
ID_004 | | | | A
ID_001 | 0.4 | 2.5 | | B
ID_002 | | | | B
ID_003 | | | | B
ID_004 | | | | B
",
  sep = "|",
  colClasses = c("character", "numeric", "numeric", "numeric", "character"))
and I'm trying to perform some row-wise operations, which sometimes depend on data from previous rows. More specifically:
calc_EO_1 <- function(EO_1, EO_2) {
  EO_1 <- shift(EO_1, type = "lag") * shift(EO_2, type = "lag")
  return(EO_1)
}
calc_EO_2 <- function(EO_1, EO_2, EO_3) {
  EO_2 <- EO_1 * shift(EO_2, type = "lag") * shift(EO_3, type = "lag")
  return(EO_2)
}
calc_EO_3 <- function(EO_1, EO_2) {
  EO_3 <- EO_1 * EO_2
  return(EO_3)
}
EO_3 has to be calculated for the first row of each group first, since it depends only on the other two fields of that same row (that part is easy). After that, all three operations have to be applied in order, one row at a time, because each row depends on the results of the row before it.
The closest I've got is the following:
first_row_bygroup_index <- dt[, .I[1], by = GROUP]$V1
dt[first_row_bygroup_index,
   EO_3 := calc_EO_3(EO_1, EO_2)
]
dt[!first_row_bygroup_index,
   `:=` (
     EO_1 = calc_EO_1(EO_1, EO_2),
     EO_2 = calc_EO_2(EO_1, EO_2, EO_3),
     EO_3 = calc_EO_3(EO_1, EO_2)
   ),
   by = row.names(dt[!first_row_bygroup_index])]
but it only calculates the first row of each group properly:
ID | EO_1 | EO_2 | EO_3 | GROUP
ID_001 | 0.5 | 1.2 | 0.6 | A
ID_002 | | | | A
ID_003 | | | | A
ID_004 | | | | A
ID_001 | 0.4 | 2.5 | 1.0 | B
ID_002 | | | | B
ID_003 | | | | B
ID_004 | | | | B
The blank cells are NAs.
I don't think I'm too far away from the solution, but I'm not able to find a way to make it work. The problem is that I can't perform operations in subsets of rows using rows from outside the subset.
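To illustrate why the attempt above stalls, here is a minimal sketch on plain vectors (the names `eo1`, `eo2`, `pass1` and `single_row` are made up for illustration). `shift()` lags whatever values exist at the moment it is called, so one vectorised pass can fill at most one more row, and a group that holds a single row has nothing to lag from:

```r
library(data.table)

# Hypothetical stand-ins for one group's EO_1 and EO_2 columns.
eo1 <- c(0.5, NA, NA, NA)
eo2 <- c(1.2, NA, NA, NA)

# One vectorised pass of calc_EO_1: lag both columns, then multiply.
# Only position 2 has a non-NA predecessor, so only it gets a value.
pass1 <- shift(eo1, type = "lag") * shift(eo2, type = "lag")
print(pass1)

# Grouping by single rows: the lag of a length-one vector is NA,
# because the previous row lies outside the group.
single_row <- shift(0.5, type = "lag")
print(single_row)
```

Here only the second element of `pass1` is filled (0.6), and `single_row` is NA, which matches the behaviour described above.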
EDIT: I forgot to include the expected result:
ID | EO_1 | EO_2 | EO_3 | GROUP
ID_001 | 0.50000000 | 1.20000000 | 0.60000000 | A
ID_002 | 0.60000000 | 0.43200000 | 0.25920000 | A
ID_003 | 0.25920000 | 0.02902376 | 0.00752296 | A
ID_004 | 0.00752296 | 0.00000164 | 0.00000001 | A
ID_001 | 0.40000000 | 2.50000000 | 1.00000000 | B
ID_002 | 1.00000000 | 2.50000000 | 2.50000000 | B
ID_003 | 2.50000000 | 15.62500000 | 39.06250000 | B
ID_004 | 39.06250000 | 23841.85791016 | 931322.57461548 | B
NEW EDIT: I came up with the following snippet, but I would rather wait a bit to see whether someone can come up with a more efficient solution:
while (any(is.na(dt))) {
  dt[, `:=` (
    EO_3 = calc_EO_3(EO_1, EO_2),
    EO_1 = ifelse(ID == "ID_001", EO_1, calc_EO_1(EO_1, EO_2)),
    EO_2 = ifelse(ID == "ID_001", EO_2, calc_EO_2(EO_1, EO_2, EO_3))
  )]
}
I've come up with a similar dplyr solution, which also relies on that ugly while-loop fix. The key would be to find a way to do a row-wise calculation that can read values from the previous row, even when that previous row is outside the selected subset. I hope someone can improve on this, so I'll wait a little before marking it as the solution.
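One way to let each row see the previous one is to hand a whole group to a helper that walks through it in order. The following is only a sketch under that assumption: `fill_group` is a hypothetical helper name, and the table is rebuilt with `data.table()` rather than `fread()` so the block runs on its own.

```r
library(data.table)

# Hypothetical helper: fills one group's three columns row by row,
# so row i can read the already-computed values of row i - 1.
fill_group <- function(eo1, eo2, eo3) {
  eo3[1L] <- eo1[1L] * eo2[1L]
  for (i in seq_along(eo1)[-1L]) {
    eo1[i] <- eo1[i - 1L] * eo2[i - 1L]
    eo2[i] <- eo1[i] * eo2[i - 1L] * eo3[i - 1L]
    eo3[i] <- eo1[i] * eo2[i]
  }
  list(eo1, eo2, eo3)
}

# Same shape as the table in the question: only the first row of
# each group is populated.
dt <- data.table(
  ID    = rep(sprintf("ID_%03d", 1:4), times = 2),
  EO_1  = c(0.5, NA, NA, NA, 0.4, NA, NA, NA),
  EO_2  = c(1.2, NA, NA, NA, 2.5, NA, NA, NA),
  EO_3  = NA_real_,
  GROUP = rep(c("A", "B"), each = 4)
)

# Update all three columns by reference, one GROUP at a time.
dt[, c("EO_1", "EO_2", "EO_3") := fill_group(EO_1, EO_2, EO_3), by = GROUP]
print(dt)
```

This makes a single pass over each group instead of repeating vectorised passes until no NA remains. It assumes the rows of each group are already in the intended order.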
Here is another possible approach:
dt[!is.na(EO_1), EO_3 := EO_1 * EO_2, by=.(GROUP)]
dt[ID!="ID_001", c("EO_1", "EO_2", "EO_3") :=
  dt[,
    {
      eo1 <- EO_1[1L]; eo2 <- EO_2[1L]; eo3 <- EO_3[1L]
      .SD[ID!="ID_001",
        {
          eo1 <- eo1 * eo2
          eo2 <- eo1 * eo2 * eo3
          eo3 <- eo1 * eo2
          .(eo1, eo2, eo3)
        },
        by=.(ID)]
    },
    by=.(GROUP)][, -1L:-2L]
]
output:
ID EO_1 EO_2 EO_3 GROUP
1: ID_001 0.50000000 1.200000e+00 6.000000e-01 A
2: ID_002 0.60000000 4.320000e-01 2.592000e-01 A
3: ID_003 0.25920000 2.902376e-02 7.522960e-03 A
4: ID_004 0.00752296 1.642598e-06 1.235720e-08 A
5: ID_001 0.40000000 2.500000e+00 1.000000e+00 B
6: ID_002 1.00000000 2.500000e+00 2.500000e+00 B
7: ID_003 2.50000000 1.562500e+01 3.906250e+01 B
8: ID_004 39.06250000 2.384186e+04 9.313226e+05 B
Is this the kind of output you expect the end product to look like?
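go <- function(x, y, n) {
  # Rows are built newest-first: each step prepends the next row.
  z <- x * y
  for (i in seq_len(n - 1)) {  # seq_len() skips the loop when n is 1
    x <- c(x[1] * y[1], x)
    y <- c(x[1] * y[1] * z[1], y)  # x[1] is already the new EO_1 here
    z <- x * y
  }
  # Reverse into top-to-bottom row order and round for display.
  data.table(EO_1 = x, EO_2 = y, EO_3 = z)[.N:1][, lapply(.SD, round, 8)]
}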
go(.5, 1.2, 4)
EO_1 EO_2 EO_3
1: 0.50000000 1.20000000 0.60000000
2: 0.60000000 0.43200000 0.25920000
3: 0.25920000 0.02902376 0.00752296
4: 0.00752296 0.00000164 0.00000001
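The same recurrence can also be written with base R's `Reduce(..., accumulate = TRUE)`, which carries the previous row forward as state. This is only a sketch: `step` and `go_reduce` are hypothetical names, and rounding is left out so the full-precision values are kept.

```r
# One step of the recurrence: builds the next row from the previous
# row's (EO_1, EO_2, EO_3). The second argument is just a counter.
step <- function(prev, i) {
  eo1 <- prev[["EO_1"]] * prev[["EO_2"]]
  eo2 <- eo1 * prev[["EO_2"]] * prev[["EO_3"]]
  c(EO_1 = eo1, EO_2 = eo2, EO_3 = eo1 * eo2)
}

go_reduce <- function(x, y, n) {
  stopifnot(n >= 2)  # this sketch does not handle a single-row group
  first <- c(EO_1 = x, EO_2 = y, EO_3 = x * y)
  # accumulate = TRUE keeps every intermediate state, i.e. every row.
  rows <- Reduce(step, seq_len(n - 1L), accumulate = TRUE, init = first)
  as.data.frame(do.call(rbind, rows))
}

res <- go_reduce(0.5, 1.2, 4)
print(res)
```

With data.table loaded, something like `dt[, go_reduce(EO_1[1L], EO_2[1L], .N), by = GROUP]` should apply it per group, though that call is untested here.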