What is the idiomatic data.table approach to produce a data.table with separate columns for elements of a vector returned by a function, calculated by group?
Consider the data.table:
library(data.table)
data(iris)
setDT(iris)
If the function is range(), I'd want the output similar to:
iris[, .(min_petal_width = min(Petal.Width), 
         max_petal_width = max(Petal.Width)
         ), keyby = Species] # produces desired output
but using the range() function.
I can use dcast, but it's ugly:
dcast(
  iris[, .( petal_width = range(Petal.Width), 
            value = c("min_petal_width", "max_petal_width")), 
       keyby = Species],
  Species ~ value, value.var = "petal_width")
I'm hoping there's a simpler expression, along the lines of:
iris[, (c("min_petal_width","max_petal_width")) = range(Petal.Width), 
      keyby = Species] # doesn't work
You can also apply a named list of functions; the list names become the column names:
iris[, lapply(list(min = min, max = max), function(f) f(Petal.Width)), by = Species]
#       Species min max
# 1:     setosa 0.1 0.6
# 2: versicolor 1.0 1.8
# 3:  virginica 1.4 2.5
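As a small variation (a sketch, not part of the original answer): because the output columns take their names from the list of functions, descriptive names can be supplied there directly. The copy `dt` and the variable `funs` are illustrative names, used only so that `iris` itself is left untouched.

```r
library(data.table)

# Work on a copy so the built-in iris data.frame is not modified.
dt <- as.data.table(iris)

# The list names (not the function names) become the output column names.
funs <- list(min_petal_width = min, max_petal_width = max)
res <- dt[, lapply(funs, function(f) f(Petal.Width)), keyby = Species]
```

This keeps the call in one place even if more summary functions are added to `funs` later.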
Your approach was very close. data.table expects a list with one element per column, so wrap the vector in as.list() and it will happily accept it. Hence, you can use:
iris[, c("min_petal_width","max_petal_width") := as.list(range(Petal.Width)), 
     by = Species]
I misread the question. Since you want an aggregated result rather than new columns added to the table, you could use:
cols <- c("min_petal_width", "max_petal_width")
iris[, setNames(as.list(range(Petal.Width)), cols), keyby = Species] 
But I'm sure there are a few other data.table approaches, too.
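One such variation, sketched here as an illustration rather than taken from the answer: when the function already returns a named vector, as.list() keeps those names, so no setNames() call is needed. quantile() is used purely as an example of such a function, and `dt` is an assumed copy of the data.

```r
library(data.table)

# Work on a copy so the built-in iris data.frame is not modified.
dt <- as.data.table(iris)

# quantile() returns a named vector ("25%", "75%"); as.list() preserves
# those names, and data.table uses them as the column names.
res <- dt[, as.list(quantile(Petal.Width, probs = c(0.25, 0.75))),
          keyby = Species]
```

For range(), which returns an unnamed vector, the setNames() step is still required.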
If readability and conciseness are really important to you, I would define a custom function or binary operator that you can then use directly in the j expression of your data.table call, e.g.:
# custom function
.nm <- function(v,vnames){
  `names<-`(as.list(v),vnames)
}
# custom binary operator
`%=%` <- function(vnames,v){
  `names<-`(as.list(v),vnames)
}
# using custom function
iris[, .nm(range(Petal.Width),c("min_petal_width", "max_petal_width")), keyby = Species]
# using custom binary operator
iris[, c("min_petal_width", "max_petal_width") %=% range(Petal.Width), keyby = Species]
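To check that the operator gives the same table as the explicit min()/max() call from the question, a quick comparison along these lines could be used. This is only a sketch: the operator is redefined with setNames() to keep the snippet self-contained, and `dt`, `explicit`, `via_op` and `same` are illustrative names.

```r
library(data.table)

# Work on a copy so the built-in iris data.frame is not modified.
dt <- as.data.table(iris)

# Same idea as the `names<-` version: turn the vector into a named list.
`%=%` <- function(vnames, v) setNames(as.list(v), vnames)

# Reference result: one explicit call per column.
explicit <- dt[, .(min_petal_width = min(Petal.Width),
                   max_petal_width = max(Petal.Width)), keyby = Species]

# Result via the custom operator and a single range() call.
via_op <- dt[, c("min_petal_width", "max_petal_width") %=% range(Petal.Width),
             keyby = Species]

# TRUE when both tables have the same key, column names and values.
same <- isTRUE(all.equal(explicit, via_op))
```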