I'd like to create a function to which the user can pass dplyr code as a parameter. The user always needs to supply code for both variants: one for a data frame without a category column and one for a data frame with it. The function itself decides which code to use, based on whether the category column is present in the data frame.
Let's say I have these datasets:
library(dplyr)
start_date <- as.Date("2024-01-01")
end_date <- Sys.Date()
dates <- seq(from = start_date, to = end_date, by = "week")
states <- c("California", "Texas", "Florida", "New York", "Illinois", "Ohio", "Georgia", "North Carolina", "Michigan", "Pennsylvania")
category <- c("A", "B", "C")
df1 <- expand.grid(date = dates, state = sample(states, length(dates), replace = TRUE)) %>%
  mutate(
    units = sample(100:1000, nrow(.), replace = TRUE),
    amount = round(runif(nrow(.), min = 1000, max = 10000), 2)
  )
df2 <- expand.grid(date = dates, state = sample(states, length(dates), replace = TRUE), category = sample(category, length(dates), replace = TRUE)) %>%
  mutate(
    units = sample(100:1000, nrow(.), replace = TRUE),
    amount = round(runif(nrow(.), min = 1000, max = 10000), 2)
  )
I created this function, which detects whether the category column is present:
apply_conditional_summary <- function(data, code1, code2) {
  if ("category" %in% colnames(data)) {
    result <- eval(substitute(data %>% code2))
  } else {
    result <- eval(substitute(data %>% code1))
  }
  return(result)
}
I'd like to call it this way:
apply_conditional_summary(
  df1,
  group_by(date, states) %>% summarise(units = mean(units), total_amt = sum(amount), .groups = "drop"),
  group_by(date, states, category) %>%
    summarise(units = mean(units), total_amt = sum(amount), .groups = "drop") %>%
    group_by(date, states) %>%
    mutate(share = total_amt / sum(total_amt)) %>%
    ungroup()
)
But I get this error:
Error in `%>%`(., group_by(date, states), summarise(units = mean(units), :
unused argument (summarise(units = mean(units), total_amt = sum(amount), .groups = "drop"))
Fun problem!
I made some slight changes. For example, I added data %>% to code1 and code2 because I can't remember how to concatenate that programmatically. I prefer rlang to base R's metaprogramming. You can read the metaprogramming section of Advanced R for a deep dive or this PDF cheat sheet for a quick intro.
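For reference, a minimal sketch of how that concatenation could be done programmatically. The helper name prepend_data and its data_sym argument are hypothetical, not part of rlang or dplyr. The idea: substituting a whole chain into data %>% code2 nests the chain on the right-hand side, so magrittr sees the inner %>% call and inserts data as an extra first argument, which is the "unused argument" error above. Walking down the left-hand side of the chain and putting data in front of the first step avoids that.

```r
# Hypothetical helper: put `data` in front of the first step of a %>% chain.
# A chain a %>% b %>% c parses as `%>%`(`%>%`(a, b), c), so the first step is
# found by following the left-hand side until it is no longer a %>% call.
prepend_data <- function(expr, data_sym = quote(data)) {
  if (is.call(expr) && identical(expr[[1]], as.name("%>%"))) {
    # Still inside the chain: recurse into the left-hand side.
    expr[[2]] <- prepend_data(expr[[2]], data_sym)
    expr
  } else {
    # Reached the first step: build data %>% <first step>.
    call("%>%", data_sym, expr)
  }
}

# quote() only captures the code, so nothing is evaluated here.
pipeline <- quote(group_by(date, state) %>% summarise(units = mean(units)))
full <- prepend_data(pipeline)
print(full)
```

The rewritten expression could then be evaluated inside the function the same way as a hand-written data %>% ... pipeline, provided an object named data is visible where it is evaluated.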
I used enexpr (an rlang function that dplyr re-exports, so dplyr::enexpr works too), replacing eval(substitute(...)). You also mistyped some of the column names, e.g. states, which should have been state.
Lastly, I used rlang::eval_tidy to run the code.
I haven't checked that the output is correct, however. I'll leave that to you, and please let me know!
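apply_conditional_summary <- function(data, code1, code2) {
  # Capture the chosen pipeline unevaluated; the other one is never run.
  if ("category" %in% colnames(data)) {
    result <- enexpr(code2)
  } else {
    result <- enexpr(code1)
  }
  # Evaluate in this function's environment, so `data` inside the captured
  # pipeline refers to the `data` argument.
  return(rlang::eval_tidy(result))
}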
apply_conditional_summary(
  df2,
  data %>% group_by(date, state) %>% summarise(units = mean(units), total_amt = sum(amount), .groups = "drop"),
  data %>% group_by(date, state, category) %>%
    summarise(units = mean(units), total_amt = sum(amount), .groups = "drop") %>%
    group_by(date, state) %>%
    mutate(share = total_amt / sum(total_amt)) %>%
    ungroup()
)
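If typing data %>% in every call is unwanted, one alternative (a sketch under my own assumptions, not the approach used above) is to pass each pipeline as a function. magrittr turns a chain that starts with . into a function, so no expression capture is needed. The names apply_conditional_fn, fn_without, fn_with, and toy are made up for illustration, and the tiny data frame is there only so the share column can be checked by hand.

```r
library(dplyr)

# Hypothetical variant: each pipeline is a function of the data frame.
apply_conditional_fn <- function(data, fn_without, fn_with) {
  if ("category" %in% colnames(data)) fn_with(data) else fn_without(data)
}

# Small hand-made data set so the expected shares are easy to verify.
toy <- data.frame(
  state = c("Ohio", "Ohio", "Texas", "Texas"),
  category = c("A", "B", "A", "B"),
  amount = c(10, 30, 20, 20)
)

out <- apply_conditional_fn(
  toy,
  # `. %>% ...` builds a function (a magrittr functional sequence).
  . %>% group_by(state) %>% summarise(total_amt = sum(amount), .groups = "drop"),
  . %>% group_by(state, category) %>%
    summarise(total_amt = sum(amount), .groups = "drop") %>%
    group_by(state) %>%
    mutate(share = total_amt / sum(total_amt)) %>%
    ungroup()
)

# Sanity check: shares within each state should add up to 1.
check <- out %>% group_by(state) %>% summarise(share_sum = sum(share), .groups = "drop")
print(out)
print(check)
```

Because toy has a category column, the second pipeline runs; Ohio's shares come out as 0.25 and 0.75, and Texas's as 0.5 each. The same kind of per-group sum is a quick way to check the output on df2.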