I would like to apply dplyr::summarise and dplyr::summarise_each at the same time to a grouped data frame. Is it possible?
My data looks like this:
mydf <- data.frame(
  id = c(rep(1, 2), rep(2, 3), rep(3, 4)),
  amount = c(rep(1, 4), rep(2, 5)),
  type1 = c(rep(1, 2), rep(0, 7)),
  type2 = c(rep(0, 4), rep(1, 5))
)
mydf
#  id amount type1 type2
#1  1      1     1     0
#2  1      1     1     0
#3  2      1     0     0
#4  2      1     0     0
#5  2      2     0     1
#6  3      2     0     1
#7  3      2     0     1
#8  3      2     0     1
#9  3      2     0     1
I would like to sum the amount variable within each id and get the max of each type variable. I know I can do this as follows:
mydf %>%
  group_by(id) %>%
  summarise(amount = sum(amount), type1 = max(type1), type2 = max(type2))
However, I have a lot of type variables, so I would prefer something like this (but with the sum of amount as well):
mydf %>%
  group_by(id) %>%
  summarise_each(funs(max), matches("type"))
Using dplyr
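library(dplyr)
mydf %>%
  group_by(id) %>%
  # per-id total of amount, repeated on every row of the group
  mutate(amount = sum(amount)) %>%
  # per-id max of every column whose name matches "type"
  mutate_each(funs(max), matches("type")) %>%
  # rows within a group are now identical, so keep one per id
  unique()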
#Source: local data table [3 x 4]
#  id amount type1 type2
#1  1      2     1     0
#2  2      4     0     1
#3  3      8     0     1
Or, more simply, as @HongOoi indicated:
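mydf %>%
  group_by(id) %>%
  # amount becomes the per-id total on every row
  mutate(amount = sum(amount)) %>%
  # max of every non-grouping column; for amount this is just the total
  summarise_each(funs(max))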
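In newer dplyr releases, summarise_each, mutate_each and funs are deprecated. A minimal sketch of the same idea with across(), assuming dplyr 1.0.0 or later is installed (the name `result` is only illustrative and not part of the original answer):

```r
library(dplyr)

# Same example data as in the question
mydf <- data.frame(
  id = c(rep(1, 2), rep(2, 3), rep(3, 4)),
  amount = c(rep(1, 4), rep(2, 5)),
  type1 = c(rep(1, 2), rep(0, 7)),
  type2 = c(rep(0, 4), rep(1, 5))
)

result <- mydf %>%
  group_by(id) %>%
  summarise(
    amount = sum(amount),          # one named summary for amount
    across(matches("type"), max)   # max of every column matching "type"
  )

result
```

This should give the same three rows as above (amount 2, 4, 8 for id 1, 2, 3), and it does the sum and the max in a single summarise call, so the mutate step and the unique step are no longer needed.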