I'm not sure which function to use to do the following:
library(data.table)
dt = data.table(a = 1:4, b = 1:2)
dt[, rep(a[1], 3), by = b]
# b V1
#1: 1 1
#2: 1 1
#3: 1 1
#4: 2 2
#5: 2 2
#6: 2 2
Both summarise
and mutate
are unhappy with this length:
library(dplyr)
df = data.frame(a = 1:4, b = 1:2)
df %.% group_by(b) %.% summarise(rep(a[1], 3))
#Error: expecting a single value
df %.% group_by(b) %.% mutate(rep(a[1], 3))
#Error: incompatible size (3), expecting 2 (the group size) or 1
One great feature of the group_by function is its ability to group by more than one variable to show what the aggregated data looks like for combinations of the different variables across the response variable. All that you need to do is add a comma between the different variables in group_by .
group_by() takes an existing tbl and converts it into a grouped tbl where operations are performed "by group". ungroup() removes grouping.
Group_by() function belongs to the dplyr package in the R programming language, which groups the data frames.
The group_by() method in tidyverse can be used to accomplish this. When working with categorical variables, you may use the group_by() method to divide the data into subgroups based on the variable's distinct categories.
In dplyr
version 0.2 you could do this using the do
operator:
> df %>% group_by(b) %>% do(data.frame(a = rep(.$a[1], 3)))
#Source: local data frame [6 x 2]
#Groups: b
#
# b a
#1 1 1
#2 1 1
#3 1 1
#4 2 2
#5 2 2
#6 2 2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With