
Use dplyr's summarise and summarise_each together?

Tags: r, dplyr

I would like to apply dplyr::summarise and dplyr::summarise_each at the same time for a grouped data frame. Is it possible?

My data looks like this:

mydf <- data.frame(
    id = c(rep(1,2), rep(2, 3), rep(3, 4)), 
    amount = c(rep(1,4), rep(2,5)), 
    type1 = c(rep(1, 2), rep(0, 7)),
    type2 = c(rep(0, 4), rep(1, 5))
)
mydf
#  id amount type1 type2
#1  1      1     1     0
#2  1      1     1     0
#3  2      1     0     0
#4  2      1     0     0
#5  2      2     0     1
#6  3      2     0     1
#7  3      2     0     1
#8  3      2     0     1
#9  3      2     0     1

For each id, I would like to sum the amount variable and take the max of each type variable. I know I can do this as follows:

library(dplyr)

# One row per id: total amount, plus the max of each type column
mydf %>% 
    group_by(id) %>% 
    summarise(amount = sum(amount), type1 = max(type1), type2 = max(type2))

However, I have a lot of type variables, so I would prefer something like this (but with the sum of amount included as well):

mydf %>%
    group_by(id) %>%
    summarise_each(funs(max), matches("type"))
asked Aug 04 '15 17:08 by janosdivenyi



1 Answer

Using dplyr, you can compute both with mutate and then drop the duplicated rows:

library(dplyr)

mydf %>% 
     group_by(id) %>% 
     mutate(amount = sum(amount)) %>% 
     mutate_each(funs(max), matches("type")) %>%
     unique

#Source: local data table [3 x 4]

#  id amount type1 type2
#1  1      2     1     0
#2  2      4     0     1
#3  3      8     0     1

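Or, more simply, as @HongOoi suggested: once amount holds the group total on every row, taking the max of every column collapses each group to a single row.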

mydf %>% 
     group_by(id) %>% 
     mutate(amount=sum(amount)) %>% 
     summarise_each(funs(max))
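
On recent dplyr versions, summarise_each(), mutate_each() and funs() are deprecated. A minimal sketch of the same idea with across(), assuming dplyr 1.0.0 or later is installed (the name result is only for illustration), puts the sum and the per-column max in a single summarise() call:

```r
library(dplyr)

# Same sample data as in the question
mydf <- data.frame(
    id = c(rep(1, 2), rep(2, 3), rep(3, 4)),
    amount = c(rep(1, 4), rep(2, 5)),
    type1 = c(rep(1, 2), rep(0, 7)),
    type2 = c(rep(0, 4), rep(1, 5))
)

# Assumes dplyr >= 1.0.0, where across() is available
result <- mydf %>%
    group_by(id) %>%
    summarise(
        amount = sum(amount),           # total amount per id
        across(matches("type"), max)    # max of every column whose name matches "type"
    )

result
# amount per id: 2, 4, 8; type1: 1, 0, 0; type2: 0, 1, 1
```

This avoids the mutate-then-deduplicate step, and matches("type") picks up additional type columns without listing them.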
answered Oct 09 '22 23:10 by Veerendra Gadekar