Starting dataframe
data_start <- data.frame(marker = c("yes","yes","no","yes","no"),
                         id_out = c(5,3,1,1,7),
                         id_new = c(6,8,9,4,2))
> data_start
  marker id_out id_new
1    yes      5      6
2    yes      3      8
3     no      1      9
4    yes      1      4
5     no      7      2
Add three new columns, initially empty, and fill the first row with the starting var1:var3 values.
data_start[,c("var1", "var2", "var3")] <- NA
vars <- c(5,3,1)
data_start[1, 4:6] <- vars
> data_start
  marker id_out id_new var1 var2 var3
1    yes      5      6    5    3    1
2    yes      3      8   NA   NA   NA
3     no      1      9   NA   NA   NA
4    yes      1      4   NA   NA   NA
5     no      7      2   NA   NA   NA
I would like to update my var1:var3 columns by applying a function to each row: IF marker = yes AND id_out matches ANY of var1:var3, replace the matching var1:var3 value with id_new. I found this solution, but it only works for one row at a time and still requires each new row's var1:var3 values to be filled in before it can be applied.
data_start[1, 4:6][data_start[1, 4:6] == data_start[1,"id_out"]] <- data_start[1,"id_new"]
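For context, the line above relies on logical indexing: the comparison produces a TRUE/FALSE mask over var1:var3, and the assignment overwrites only the positions where the mask is TRUE. A minimal sketch on a plain vector, where row_vars, id_out, id_new and mask are illustrative stand-ins for the values of one row:

```r
# One row's var1:var3 values, plus the value to look for and its replacement
row_vars <- c(var1 = 5, var2 = 3, var3 = 1)
id_out <- 5
id_new <- 6

# TRUE wherever a var equals id_out
mask <- row_vars == id_out

# Overwrite only the matching positions; the others are left untouched
row_vars[mask] <- id_new
```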
Each row also depends on the values from the row above, which have to be carried down before the function is applied again.
The final output would look like this: rows stay unchanged when marker = no, and each row is updated in sequence.
> data_final
  marker id_out id_new var1 var2 var3
1    yes      5      6    6    3    1
2    yes      3      8    6    8    1
3     no      1      9    6    8    1
4    yes      1      4    6    8    4
5     no      7      2    6    8    4
This works with any number of columns and uses only base R:
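cols <- c("var1", "var2", "var3")
for (j in seq_along(cols)) {
  var <- cols[j]
  for (i in seq_len(nrow(data_start))) {
    # Carry the value down from the row above (row 1 keeps its starting value)
    if (i > 1) {
      data_start[i, var] <- data_start[i - 1, var]
    }
    # Replace the value only when marker is "yes" and it matches id_out
    if (data_start[i, "marker"] == "yes" && data_start[i, var] == data_start[i, "id_out"]) {
      data_start[i, var] <- data_start[i, "id_new"]
    }
  }
}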
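The same idea can also be sketched as a single pass over the rows, keeping a running vector of the current var values instead of looping column by column. This is only one possible variant, not part of the answer above; update_vars and its arguments df, cols and start are hypothetical names chosen for illustration, and the sketch assumes marker, id_out and id_new contain no NA values.

```r
# Rebuild the starting data frame from the question
data_start <- data.frame(marker = c("yes", "yes", "no", "yes", "no"),
                         id_out = c(5, 3, 1, 1, 7),
                         id_new = c(6, 8, 9, 4, 2))

# Hypothetical helper: walk down the rows, carrying the current values along
update_vars <- function(df, cols, start) {
  df[, cols] <- NA_real_          # create (or reset) the target columns
  cur <- start                    # running values of var1:var3
  for (i in seq_len(nrow(df))) {
    if (df$marker[i] == "yes") {
      # Swap every running value equal to id_out for id_new
      cur[cur == df$id_out[i]] <- df$id_new[i]
    }
    df[i, cols] <- cur            # write the current state into row i
  }
  df
}

data_final <- update_vars(data_start, c("var1", "var2", "var3"), c(5, 3, 1))
```

Because the running vector is compared as a whole, the number of var columns only needs to match the length of start.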