
R: use dplyr code as a function parameter

Tags: r, dplyr

I'd like to create a function where the user can pass dplyr code as a parameter. The user always supplies code for both variants: one for a data frame without a category column and one for a data frame with it. The function itself decides which code to use, based on whether a category column is present in the data frame.

Let's say I have these datasets:

library(dplyr)

start_date <- as.Date("2024-01-01")
end_date <- Sys.Date()
dates <- seq(from = start_date, to = end_date, by = "week")
states <- c("California", "Texas", "Florida", "New York", "Illinois", "Ohio", "Georgia", "North Carolina", "Michigan", "Pennsylvania")
category <- c("A", "B", "C")

df1 <- expand.grid(date = dates, state = sample(states, length(dates), replace = TRUE)) %>%
  mutate(
    units = sample(100:1000, nrow(.), replace = TRUE),
    amount = round(runif(nrow(.), min = 1000, max = 10000), 2)
  )

df2 <- expand.grid(date = dates, state = sample(states, length(dates), replace = TRUE), category = sample(category, length(dates), replace = TRUE)) %>%
  mutate(
    units = sample(100:1000, nrow(.), replace = TRUE),
    amount = round(runif(nrow(.), min = 1000, max = 10000), 2)
  )

I created this function, which detects whether the category column is present:

apply_conditional_summary <- function(data, code1, code2) {

  if ("category" %in% colnames(data)) {

    result <- eval(substitute(data %>% code2))
  } else {

    result <- eval(substitute(data %>% code1))
  }
  
  return(result)
}

I'd like to call it this way:

apply_conditional_summary(
  df1, 
  group_by(date, states) %>% summarise(units = mean(units), total_amt = sum(amount), .groups = "drop"),
  group_by(date, states, category) %>% 
    summarise(units = mean(units), total_amt = sum(amount), .groups = "drop") %>%
    group_by(date, states) %>%
    mutate(share = total_amt / sum(total_amt)) %>%
    ungroup()
)

But I get this error:

Error in `%>%`(., group_by(date, states), summarise(units = mean(units),  : 
  unused argument (summarise(units = mean(units), total_amt = sum(amount), .groups = "drop"))
prdel99 asked Mar 04 '26 22:03


1 Answer

Fun problem!

I made some slight changes. For example, I added data %>% to the front of the code1 and code2 arguments at the call site, because I can't remember how to prepend it programmatically. I prefer rlang to base R's metaprogramming. You can read the metaprogramming section of Advanced R for a deep dive or this PDF cheat sheet for a quick intro.

I used rlang's enexpr (also available as dplyr::enexpr) in place of eval(substitute(...)). You also mistyped some of the column names, e.g. states, which should have been state.

Lastly, I used rlang::eval_tidy to run the code.
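
To make the capture-then-evaluate pattern concrete, here is a minimal sketch on a built-in dataset. The function name capture_and_run is hypothetical and only for illustration. It relies on eval_tidy evaluating in its caller's environment by default, which here is the function body, so the symbol data resolves to the function argument.

```r
library(rlang)

# Hypothetical helper, for illustration only.
capture_and_run <- function(data, code) {
  captured <- enexpr(code)  # the caller's expression, not yet evaluated
  eval_tidy(captured)       # evaluated in this function's frame, where `data` exists
}

capture_and_run(mtcars, nrow(data))  # 32
```

One consequence of this design is that the supplied code is evaluated inside the function rather than where the user wrote it, so variables that are local to the calling function (as opposed to global ones) would not be visible to it.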

I haven't checked that the output is correct, however. I'll leave that to you, and please let me know!

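apply_conditional_summary <- function(data, code1, code2) {
  
  # Capture the matching pipeline as an unevaluated expression
  if ("category" %in% colnames(data)) {
    result <- rlang::enexpr(code2)
  } else {
    result <- rlang::enexpr(code1)
  }
  
  # Evaluate it here, so that `data` in the pipeline refers to the argument above
  return(rlang::eval_tidy(result))
}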

apply_conditional_summary(
  df2, 
  data %>% group_by(date, state) %>% summarise(units = mean(units), total_amt = sum(amount), .groups = "drop"),
  data %>% group_by(date, state, category) %>% 
    summarise(units = mean(units), total_amt = sum(amount), .groups = "drop") %>%
    group_by(date, state) %>%
    mutate(share = total_amt / sum(total_amt)) %>%
    ungroup()
)
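
For the prepending step mentioned above, here is a minimal sketch of one way to do it, so the call can stay in the form the question wanted (without data %>%). The original error comes from how the substituted call is nested: it becomes df1 %>% (group_by(...) %>% summarise(...)), so magrittr tries to insert the data as the first argument of the inner %>% call, which is what the "unused argument" message shows. The sketch instead walks down the left side of the chain and puts the data in front of the first step. The names prepend_pipe_input, apply_conditional_summary2 and .input are hypothetical, and the helper assumes the pipeline is written with a bare %>%.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr or rlang): follow the left-hand side
# of a `%>%` chain down to its first step and pipe `data_sym` into that step.
prepend_pipe_input <- function(expr, data_sym) {
  if (is.call(expr) && identical(expr[[1]], quote(`%>%`))) {
    expr[[2]] <- prepend_pipe_input(expr[[2]], data_sym)
    expr
  } else {
    call("%>%", data_sym, expr)
  }
}

# Hypothetical variant of the function from the question.
apply_conditional_summary2 <- function(data, code1, code2) {
  caller <- parent.frame()
  code <- if ("category" %in% colnames(data)) substitute(code2) else substitute(code1)
  pipeline <- prepend_pipe_input(code, quote(.input))
  # `.input` is bound to the data; every other name is looked up in the caller
  eval(pipeline, list(.input = data), caller)
}

# Small made-up data frame without a category column, so code1 is used
small <- data.frame(
  date = c(1, 1, 2), state = c("a", "a", "b"),
  units = c(10, 20, 30), amount = c(1, 2, 3)
)

apply_conditional_summary2(
  small,
  group_by(date, state) %>% summarise(units = mean(units), total_amt = sum(amount), .groups = "drop"),
  group_by(date, state, category) %>% summarise(total_amt = sum(amount), .groups = "drop")
)
```

Because the unused branch is only captured and never evaluated, it does not matter that small has no category column.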
drj3122 answered Mar 06 '26 13:03


