I'd like to use dplyr to group a table by one column, then apply a function to the set of values in the second column of each group.
For instance, in the code example below, I'd like to return all of the 2-item combinations of foods eaten by each person. I cannot figure out how to pass the right column (foods) to the function inside do().
library(dplyr)
person = c( 'Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob' )
foods = c( 'apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana' )
eaten = data.frame(person, foods)
by_person = group_by(eaten, person)
# How to do this?
do( by_person, combn( x = foods, m = 2 ) )
Note that the example code in ?do fails on my machine:
mods <- do(carriers, failwith(NULL, lm), formula = ArrDelay ~ date)
%>% is the forward pipe operator in R. It forwards a value, or the result of an expression, into the next function call or expression, which makes it possible to chain commands. It is defined by the package magrittr (CRAN) and is heavily used by dplyr (CRAN).
The group_by() function belongs to the dplyr package and groups the rows of a data frame by the columns passed as arguments. On its own, group_by() does not change the data; it only records the grouping, so it is normally followed by a function such as summarise() that performs an action on each group. It works much like GROUP BY in SQL or a pivot table in Excel.
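As a small illustration of grouping followed by a per-group action, here is a sketch that counts the foods eaten by each person. It assumes a dplyr version that provides %>%-era verbs such as summarise() and n(); the column name n_foods is made up for this example.

```r
library(dplyr)

# Same data as in the question, with strings kept as character vectors
eaten <- data.frame(
  person = c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob"),
  foods  = c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana"),
  stringsAsFactors = FALSE
)

# group_by() only records the grouping; summarise() then computes one row per group
by_person   <- group_by(eaten, person)
food_counts <- summarise(by_person, n_foods = n())  # n_foods is a hypothetical name

print(food_counts)
```

The result has one row per person, which is why group_by() is usually paired with a verb that acts on each group.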
To select a specific column, you can also type the name of the data frame, followed by $, and then the name of the column. For example, eaten$foods selects the foods column of eaten, and R simplifies the result to a vector.
Let us define eaten like this:
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)
1) Then try this:
eaten %.% group_by(person) %.% do(function(x) combn(x$foods, m = 2))
giving:
[[1]]
[,1] [,2] [,3]
[1,] "apple" "apple" "banana"
[2,] "banana" "cucumber" "cucumber"
[[2]]
[,1] [,2] [,3]
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber" "banana" "banana"
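The %.% operator used above comes from early dplyr releases and was later replaced by %>%. As a rough sketch of how the same idea might look with %>%, assuming a dplyr version in which do() accepts a named expression and exposes the current group as `.`, something like the following should work. The column name pairs is invented for this example.

```r
library(dplyr)

eaten <- data.frame(
  person = c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob"),
  foods  = c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana"),
  stringsAsFactors = FALSE
)

# Inside do(), `.` refers to the rows of the current group.
# A named argument stores each result in a list column (here: pairs).
combos <- eaten %>%
  group_by(person) %>%
  do(pairs = combn(.$foods, m = 2))

# One matrix of 2-item combinations per person
print(combos$pairs)
```

Each element of combos$pairs is a 2 x 3 character matrix, in the same order as the person column.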
2) To do something close to what @Hadley describes in the comments without waiting for a future version of dplyr, try this, where do2 is found here:
library(gsubfn)
eaten %.% group_by(person) %.% fn$do2(~ combn(.$foods, m = 2))
giving:
$Grace
[,1] [,2] [,3]
[1,] "apple" "apple" "banana"
[2,] "banana" "cucumber" "cucumber"
$Rob
[,1] [,2] [,3]
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber" "banana" "banana"
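If a named list like the one above is all that is needed, a base R sketch can produce it without dplyr, gsubfn or do2. This is an alternative technique, not what the code above does: it splits the foods vector by person and applies combn() to each piece.

```r
eaten <- data.frame(
  person = c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob"),
  foods  = c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana"),
  stringsAsFactors = FALSE
)

# split() returns a list of food vectors named by person;
# lapply() then builds the 2-item combinations for each person
pairs_by_person <- lapply(split(eaten$foods, eaten$person), combn, m = 2)

print(pairs_by_person)
```

The list names come from the person column, so pairs_by_person$Grace and pairs_by_person$Rob can be accessed directly.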
Note: The last line of the question, giving the code in the help file, also fails for me. This variation of it works for me: do(jan, lm, formula = ArrDelay ~ date).