
dplyr and reusable argument lists

Tags: r, dplyr

I have played around with dplyr a little and really like it, but I am missing something. In plyr, I was able to pass a function to ddply and reuse it.

library('dplyr')
library('plyr')

fn = function(df) {
    summarise(df,
        count = length(id))
}

ddply(DF1,'group', fn)
ddply(DF2,'group', fn)

This lets me apply a long list of summaries to multiple datasets without repeating all the arguments to summarise. In dplyr, however, I have to do this:

dplyr::summarise(group_by(DF1,group),
    count = length(id))
dplyr::summarise(group_by(DF2,group),
    count = length(id))

So the arguments to summarise have to be repeated each time. Building an argument list such as list('.data'=DF1,'count'=length(id)) and passing it to do.call does not work either, because length(id) is evaluated as soon as I define the list. Are there any solutions for this?

asked Dec 20 '22 19:12 by user2503795


1 Answer

I like @RomanLustrik's answer, so here is a 100% dplyr version of it.

do(mylist, function(df)
   df %.%
   group_by(b) %.%
   summarise(count = n()))

## [[1]]
## Source: local data frame [2 x 2]

##   b count
## 1 b     5
## 2 a     5

## [[2]]
## Source: local data frame [2 x 2]

##   b count
## 1 b     5
## 2 a     5
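The `%.%` operator and the call to `do()` on a plain list come from an early dplyr release and may not be available in the version you have installed; later releases use `%>%` instead. A rough equivalent with `%>%` and base R's `lapply()` might look like the sketch below. The contents of `mylist` and the helper name `count_by_b` are assumptions made up for illustration, since the original data is not shown here.

```r
library(dplyr)

# Hypothetical stand-in for `mylist`: two data frames with a grouping column `b`.
mylist <- list(
  data.frame(b = rep(c("a", "b"), each = 5), id = 1:10),
  data.frame(b = rep(c("a", "b"), each = 5), id = 11:20)
)

# Hypothetical helper: group by `b` and count the rows in each group.
count_by_b <- function(df) {
  df %>%
    group_by(b) %>%
    summarise(count = n())
}

# Apply the same summary to every data frame in the list.
results <- lapply(mylist, count_by_b)
```

`results` is a list with one summarised data frame per input, which matches the shape of the output shown above.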

Above I just tried to replicate Roman's approach, but you can also reuse your own function (fn):

fn <- function(df) {
    summarise(df,
        count = n())
}

group_by(df1, b) %.% fn()
## Source: local data frame [2 x 2]

##   b count
## 1 b     5
## 2 a     5

group_by(df2, b) %.% fn()
## Source: local data frame [2 x 2]

##   b count
## 1 b     5
## 2 a     5

And you can even wrap it like this:

do(list(df1, df2), function(df) group_by(df, b) %.% fn())
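
If what you want is literally a reusable list of arguments rather than a wrapper function, one option in newer dplyr versions (0.7 or later, as far as I know) is to store the summary expressions unevaluated and splice them into `summarise()` with `!!!`. This is only a sketch under that assumption: the toy data, the column `x`, and the name `summary_args` are invented for the example.

```r
library(dplyr)

# Hypothetical toy data; the column names `b` and `x` are assumptions.
df1 <- data.frame(b = rep(c("a", "b"), each = 5), x = 1:10)
df2 <- data.frame(b = rep(c("a", "b"), each = 5), x = 11:20)

# Capture the summary expressions without evaluating them, so n() and
# mean(x) are not computed at the point where the list is defined.
summary_args <- rlang::exprs(count = n(), mean_x = mean(x))

# Splice the stored expressions into summarise() with !!!.
res1 <- df1 %>% group_by(b) %>% summarise(!!!summary_args)
res2 <- df2 %>% group_by(b) %>% summarise(!!!summary_args)
```

This sidesteps the problem from the question, where `length(id)` was evaluated while the argument list was being built: the expressions are only evaluated inside `summarise()`, once per group.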
answered Dec 22 '22 10:12 by dickoa