I am trying to create vectorized conditional outputs of a data frame.
Suppose I have the dataframe:
data <- data.frame(a = c(5, 3, 9, 5),
                   b = c(1, 2, 3, 4),
                   c = c(5, 3, 9, 5),
                   d = c(5, 3, 9, 5))
And the threshold:
threshold <- c(a1 = 4, b1 = 2, c1 = 8, d1 = 2)
What I want is a new dataset that indicates whether each value of a is greater than or equal to the corresponding value in the threshold vector (a1), each value of b is greater than or equal to the corresponding value in the threshold vector (b1), etc.
So the desired output would be:
desired_data <- data.frame(a = c(1, 0, 1, 1),
                           b = c(0, 1, 1, 1),
                           c = c(0, 0, 1, 0),
                           d = c(1, 1, 1, 1))
I want to do this as simply as possible, ideally using a purrr function.
Here is a wrong attempt:
desired_data <- map(data >= threshold)
I feel like map2 might be promising, and I have checked the documentation, but I can't seem to get the syntax right for conditional outputs based on mapping.
Thank you!
You have the right idea with map2(): since a data frame is a list of columns, you can loop over the columns. The only small difficulty is putting everything back into a data frame at the end, which is done automatically if you use map2_df().
map2_df(threshold, data, ~ .y >= .x)
And if you want these logical values to be converted to integers:
1L * map2_df(threshold, data, ~ .y >= .x)
#   a1 b1 c1 d1
# 1  1  0  0  1
# 2  0  1  0  1
# 3  1  1  1  1
# 4  1  1  0  1
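As a side note, the `_df` variants are marked as superseded in purrr 1.0.0 and later, so here is a minimal sketch of the same idea with plain map2(), assuming purrr is installed. Passing data as the first argument means the result keeps the original column names (a, b, c, d) rather than the threshold names. The names flags and result are only illustrative.

```r
library(purrr)

data <- data.frame(a = c(5, 3, 9, 5),
                   b = c(1, 2, 3, 4),
                   c = c(5, 3, 9, 5),
                   d = c(5, 3, 9, 5))
threshold <- c(a1 = 4, b1 = 2, c1 = 8, d1 = 2)

# Pair each column of `data` (.x) with its threshold (.y), by position.
# The comparison is vectorized over the rows of each column, and
# as.integer() turns TRUE/FALSE into 1/0.
flags <- map2(data, threshold, ~ as.integer(.x >= .y))

# map2() returns a named list (names taken from `data`);
# assemble it back into a data frame.
result <- as.data.frame(flags)
result
```

Note that the pairing is purely positional: the names in threshold (a1, b1, ...) are not matched against the column names, so the vector has to be in the same order as the columns.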