I have this data frame
> df
# rn a b c d e f
# 1 1 2 NA NA NA NA
# 2 5 8 NA 4 5 6
# 3 8 5 4 2 3 2
# 4 4 2 5 5 6 2
I am trying to create a new column g that's based on these conditions:
If c to f are all NA, then return b.
If c is NA, then return the smallest value in columns b and f, i.e. min(b, f) or pmin.
If no NAs exist in the row, return the least value in b, e, and f + previous calculated value.
The desired output is:
> df
# rn a b c d e f g
# 1 1 2 NA NA NA NA 2
# 2 5 8 NA 4 5 6 6 ### ? [b=8, f=6; least value = 6]
# 3 8 5 4 2 3 2 3 ### ? [b=5, e=3, 'f + previous calculated value' = 2+6=8; least value = 3]
# 4 4 2 5 5 6 2 2 ### ? [b=2, e=6, 'f + previous calculated value' = 2+3=5; least value = 2]
I have tried this, but I have no idea how to access the previously calculated value (lag(g) below is only a placeholder):
df %>%
  mutate(g = case_when(
    is.na(c) & is.na(d) & is.na(e) & is.na(f) ~ b,
    is.na(c) & !is.na(d) & !is.na(e) & !is.na(f) ~ pmin(b, f),
    !is.na(c) & !is.na(d) & !is.na(e) & !is.na(f) ~ pmin(b, e, f + lag(g)),
    TRUE ~ NA)
  )
Maybe I am not thinking about this the right way. Any help is greatly appreciated!
In this case I would say use a simple for-loop.
You cannot use lag(g) because the g column has not been built yet at the point where case_when() is evaluated.
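res <- rep(NA_real_, nrow(df))  # rows that match no rule stay NA
for (i in seq_len(nrow(df))) {
  row <- df[i, ]
  # [[ ]] extracts a plain value; [ ] would return a one-cell data frame
  if (is.na(row[["c"]]) && is.na(row[["f"]])) {
    # c and f are NA: take b as it is
    res[i] <- row[["b"]]
  } else if (is.na(row[["c"]])) {
    # c is NA but f is not: smaller of b and f
    res[i] <- min(row[["b"]], row[["f"]])
  } else if (!is.na(row[["d"]])) {
    # c and d are present: smallest of b, e, and f plus the previous result
    res[i] <- min(row[["b"]], row[["e"]], row[["f"]] + res[i - 1])
  }
}
df$g <- res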
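To check the loop against the desired output, here is a minimal self-contained sketch. It rebuilds the example data from the question and wraps the same loop idea in a function. The name compute_g is hypothetical (not from any package), and two choices are my own assumptions rather than something the question states: rows that match none of the three rules get NA, and a first row with no NAs also gets NA because there is no previous value to add.

```r
# Rebuild the example data from the question (rn in the printout is the row name).
df <- data.frame(
  a = c(1, 5, 8, 4),
  b = c(2, 8, 5, 2),
  c = c(NA, NA, 4, 5),
  d = c(NA, 4, 2, 5),
  e = c(NA, 5, 3, 6),
  f = c(NA, 6, 2, 2)
)

# compute_g is a hypothetical helper name used only for this illustration.
compute_g <- function(df) {
  res <- rep(NA_real_, nrow(df))  # assumption: unmatched rows stay NA
  for (i in seq_len(nrow(df))) {
    # Logical flags for which of c, d, e, f are NA in row i
    na_flags <- is.na(unlist(df[i, c("c", "d", "e", "f")]))
    b <- df$b[i]
    e <- df$e[i]
    f <- df$f[i]
    if (all(na_flags)) {
      # c to f are all NA: take b
      res[i] <- b
    } else if (is.na(df$c[i]) && !is.na(f)) {
      # c is NA and f is present: smaller of b and f
      res[i] <- min(b, f)
    } else if (!any(na_flags) && i > 1) {
      # no NA in c to f: compare b, e, and f plus the previous g
      res[i] <- min(b, e, f + res[i - 1])
    }
  }
  res
}

df$g <- compute_g(df)
print(df)

# The result should match the g column in the desired output: 2, 6, 3, 2
stopifnot(isTRUE(all.equal(df$g, c(2, 6, 3, 2))))
```

Because each value of g depends on the one before it, the rows have to be processed in order, which is why a plain loop fits here better than a vectorised case_when().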