I'm analyzing donor data from an appeal. Let's say we have this data frame, which has an ID, the amount given, and the ask amounts from an appeal card:
dfgive <- data.frame(id = c(1, 2, 3, 4, 5),
gift = c(20, 1000, 1500, 300, 500),
ask1 = c(50, 5000, 5000, 100, 2500),
ask2 = c(100, 3750, 3750, 250, 5000),
ask3 = c(250, 2500, 2500, 500, 10000),
gift_diff_1 = c(-30, -4000, -3500, 200, -2000),
gift_diff_2 = c(-80, -2750, -2250, 50, -4500),
gift_diff_3 = c(-230, -1500, -1000, -200, -9500),
mindiff = c(-230, -4000, -3500, -200, -9500))
The gift difference from each ask amount and the minimum difference were created with the code below.
mutate(gift_diff_1 = gift - ask1) %>%
mutate(gift_diff_2 = gift - ask2) %>%
mutate(gift_diff_3 = gift - ask3) %>%
mutate(mindiff=min(gift_diff_1, gift_diff_2, gift_diff_3))
What I'm hoping to do next is create a new column called gift_closest
that would identify which of the ask amounts (ask1, ask2, or ask3) is closest in absolute value to the gift. We could use either the derived gift_diff columns or do it wholly within the equation. I can't get my head around how to best do it.
The value of the new column would be one of "ask1", "ask2", or "ask3". Ideally in a dplyr
chain as that's where I'm creating the entire data set which will include a bunch of demographic fields.
thanks...
You might consider reshaping the data if you have many different ask
columns, to reduce duplication of code. The result below can be joined back onto the original dfgive if you want to keep other columns in that dataframe.
library(tidyverse)
dfgive <- data.frame(
id = c(1, 2, 3, 4, 5),
gift = c(20, 1000, 1500, 300, 500),
ask1 = c(50, 5000, 5000, 100, 2500),
ask2 = c(100, 3750, 3750, 250, 5000),
ask3 = c(250, 2500, 2500, 500, 10000)
)
dfgive %>%
pivot_longer(
cols = starts_with("ask"),
names_to = "ask_num",
values_to = "ask"
) %>%
mutate(gift_diff = gift - ask) %>%
group_by(id) %>%
summarise(
mindiff = min(gift_diff),
gift_closest = ask_num[which.min(abs(gift_diff))]
)
#> # A tibble: 5 x 3
#> id mindiff gift_closest
#> <dbl> <dbl> <chr>
#> 1 1 -230 ask1
#> 2 2 -4000 ask3
#> 3 3 -3500 ask3
#> 4 4 -200 ask2
#> 5 5 -9500 ask1
First of all: I guess your mindiff
-part won't do what you expect it to do. Replace min
with pmin
.
library(dplyr)
dfgive %>%
mutate(abs = pmin(abs(gift - ask1), abs(gift - ask2), abs(gift - ask3)),
gift_close = case_when(abs(gift - ask1) == abs ~ "ask1",
abs(gift - ask2) == abs ~ "ask2",
abs(gift - ask3) == abs ~ "ask3",
TRUE ~ NA_character_))
returns
id gift ask1 ask2 ask3 gift_diff_1 gift_diff_2 gift_diff_3 mindiff abs gift_close
1 1 20 50 100 250 -30 -80 -230 -230 30 ask1
2 2 1000 5000 3750 2500 -4000 -2750 -1500 -4000 1500 ask3
3 3 1500 5000 3750 2500 -3500 -2250 -1000 -3500 1000 ask3
4 4 300 100 250 500 200 50 -200 -200 50 ask2
5 5 500 2500 5000 10000 -2000 -4500 -9500 -9500 2000 ask1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With