> df = data.frame(id = 1:5, ch_1 = 11:15,ch_2= 10:14,selection = c(11,13,12,14,12))
> df
  id ch_1 ch_2 selection
1  1   11   10        11
2  2   12   11        13
3  3   13   12        12
4  4   14   13        14
5  5   15   14        12
Given this data set, I need an additional column that follows these rules: 1 if selection equals ch_1, 2 if selection equals ch_2, and 3 if it matches neither.
I need a way to do this for every row. For a single row, the following code works just fine, but I can't seem to find a way to use it with apply to run it on every row of a data frame. I'm looking for a solution that can be applied to more than just two columns and that runs faster than a traditional loop.
df=df[1,]
if (df$selection %in% df[,paste("ch_",1:2,sep="")]) {
a = which(df[,paste("ch_",1:2,sep="")]==df$selection)
} else {
a = 3
}
# OR
ifelse(df$selection %in% df[,paste("ch_",1:2,sep="")],1,3)
# OR
match(df$selection,df[,paste("ch_",1:2,sep="")])
Compare the vector to the other columns with ==, add a final column which is always TRUE, and then take the index of the first TRUE in each row using max.col:
max.col(cbind(df$selection == df[c("ch_1","ch_2")], TRUE), "first")
#[1] 1 3 2 1 3
This should easily extend to n columns then.
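As a rough sketch of how that extension to n columns might look, the columns can be picked up by name pattern instead of being listed by hand. This assumes every candidate column starts with the prefix ch_; the names ch_cols, hits and result are only illustrative and do not come from the question.

```r
df <- data.frame(id = 1:5, ch_1 = 11:15, ch_2 = 10:14,
                 selection = c(11, 13, 12, 14, 12))

# Assumption: all candidate columns share the "ch_" prefix
ch_cols <- grep("^ch_", names(df), value = TRUE)

# Logical matrix: TRUE where selection equals the value in that ch_ column
hits <- df$selection == df[ch_cols]

# The extra always-TRUE column acts as the fallback, so rows with no match
# get the index length(ch_cols) + 1
df$result <- max.col(cbind(hits, TRUE), ties.method = "first")

df$result
```

With the example data the fallback index is 3, because there are two ch_ columns. Adding a ch_3 column would shift the fallback to 4 without any change to the code.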
You could do this with nested ifelse:
with(df, ifelse(selection == ch_1, 1L, ifelse(selection == ch_2, 2L, 3L)))
# [1] 1 3 2 1 3
but I'm rarely fond of nesting them. If this is all you need (and you never need more than two), then this might suffice.
One alternative is using dplyr::case_when:
with(df, dplyr::case_when(selection == ch_1 ~ 1, selection == ch_2 ~ 2, TRUE ~ 3))
It can also be used easily within dplyr::mutate if you are already using the package.
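For illustration, the mutate form could look roughly like the sketch below. It assumes dplyr is installed, and the new column name result is a placeholder chosen here, not something from the question.

```r
library(dplyr)

df <- data.frame(id = 1:5, ch_1 = 11:15, ch_2 = 10:14,
                 selection = c(11, 13, 12, 14, 12))

# Conditions are checked in order; the final TRUE branch is the fallback
df <- df %>%
  mutate(result = case_when(
    selection == ch_1 ~ 1,
    selection == ch_2 ~ 2,
    TRUE ~ 3
  ))

df$result
```

Unlike the max.col approach, each additional column needs its own case_when branch, so this reads best when the number of columns is small and fixed.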