Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replace multiple `summarize`statements by function

I'm currently repeating a lot code, since I need to summarize always the same columns for different groups. How can I do this effectively by writing the summarize function (which is always the same) only once, but define the output name and group_by arguments case by case?

A minimum example:

col1 <- c("UK", "US", "UK", "US")
col2 <- c("Tech", "Social", "Social", "Tech")
col3 <- c("0-5years", "6-10years", "0-5years", "0-5years")
col4 <- 1:4
col5 <- 5:8

df <- data.frame(col1, col2, col3, col4, col5)

result1 <- df %>% 
  group_by(col1, col2) %>% 
  summarize(sum1 = sum(col4, col5))

result2 <- df %>% 
  group_by(col2, col3) %>% 
  summarize(sum1 = sum(col4, col5))

result3 <- df %>% 
  group_by(col1, col3) %>% 
  summarize(sum1 = sum(col4, col5))
like image 429
huan Avatar asked Jan 02 '23 01:01

huan


1 Answers

Using combn:

combn(colnames(df)[1:3], 2, FUN = function(x){
  df %>% 
    group_by(.dots = x) %>% 
    summarize(sum1 = sum(col4, col5))
  }, simplify = FALSE)
like image 120
zx8754 Avatar answered Jan 08 '23 02:01

zx8754