I have a data.table that I am trying to summarise. This is my approach:
library(data.table)
dtIris <- data.table(iris)
dt1 <- dtIris[, list(AvgSepalWidth = mean(Sepal.Width)), 
              by=list(TrimSpecies = substr(Species,1,3),Petal.Length)]
I want to use a variable to identify one of the items to group by, but I can't get it to evaluate the variable inside the list. It just treats it as a string and throws an error.
myvar <- "Petal.Length"
dt1 <- dtIris[, list(AvgSepalWidth = mean(Sepal.Width)), 
              by=list(TrimSpecies = substr(Species,1,3),myvar)]
I have tried noquote(), eval() and parse(text=), all to no avail. Any guidance would be really appreciated.
You can use eval(parse(text=myvar)) or get(myvar), though that will name your grouping column parse or get respectively (you could then rename it).
myvar <- "Petal.Length"
dtIris[, list(AvgSepalWidth = mean(Sepal.Width)), 
              by=list(TrimSpecies = substr(Species,1,3), eval(parse(text=myvar)))]
dtIris[, list(AvgSepalWidth = mean(Sepal.Width)), 
              by=list(TrimSpecies = substr(Species,1,3), get(myvar))]
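As a minimal sketch of the "rename it afterwards" step mentioned above: the result is stored in a hypothetical object res, and the auto-named grouping column is renamed by position with setnames(), so the code does not rely on what name data.table happened to pick. It assumes the variable-driven grouping column is the second column of the result, which holds for the by= list used here.

```r
library(data.table)

dtIris <- data.table(iris)
myvar <- "Petal.Length"

# Group by the trimmed species and by the column whose name is stored in myvar
res <- dtIris[, list(AvgSepalWidth = mean(Sepal.Width)),
              by = list(TrimSpecies = substr(Species, 1, 3), get(myvar))]

# The second column is the auto-named grouping column; rename it by position
setnames(res, 2L, myvar)
names(res)
```

Renaming by position avoids hard-coding "get" or "parse", at the cost of depending on the order of the grouping columns.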
I am not sure how to do it in a way that preserves the name like you want it to. (Edit: by=setNames(list(...), c('TrimSpecies', myvar)) - thanks @thelatemail!)
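Spelled out, the setNames() suggestion from the edit might look like the sketch below. The contents of the list are an assumption about what the "..." stands for (the same two grouping expressions used earlier), and res is just an illustrative name.

```r
library(data.table)

dtIris <- data.table(iris)
myvar <- "Petal.Length"

# Build the grouping list first, then attach the desired names to it,
# so the second grouping column is named after the value of myvar
res <- dtIris[, list(AvgSepalWidth = mean(Sepal.Width)),
              by = setNames(list(substr(Species, 1, 3), get(myvar)),
                            c("TrimSpecies", myvar))]
names(res)
```

This keeps everything in a single call, so no rename step is needed afterwards.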
Edit - out of interest, in response to some comments below.
library(rbenchmark)
benchmark(
    eval=dtIris[, list(AvgSepalWidth = mean(Sepal.Width)), 
              by=list(TrimSpecies = substr(Species,1,3), eval(parse(text=myvar)))],
    get=dtIris[, list(AvgSepalWidth = mean(Sepal.Width)), 
              by=list(TrimSpecies = substr(Species,1,3), get(myvar))],
    chain=dtIris[, TrimSpecies := substr(Species,1,3)
              ][, list(AvgSepalWidth = mean(Sepal.Width)), by=c("TrimSpecies",myvar)
              ][, TrimSpecies := NULL][]
)
   test replications elapsed relative user.self sys.self user.child sys.child
3 chain          100   0.151    1.987     0.250        0          0         0
1  eval          100   0.079    1.039     0.097        0          0         0
2   get          100   0.076    1.000     0.094        0          0         0
get is marginally faster than eval(parse(text=...)) (0.076 s vs 0.079 s over 100 replications), and both are roughly twice as fast as defining TrimSpecies, using the character form of by and then removing the column again (chaining dts).
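For reference, the character form of by can also be used on its own, outside a benchmark. The sketch below is a variation on the chained version, not the code that was timed: it works on a copy() (the names tmp and res are illustrative), so dtIris is never modified by reference and TrimSpecies does not have to be removed afterwards.

```r
library(data.table)

dtIris <- data.table(iris)
myvar <- "Petal.Length"

# Add the helper column to a copy so the original table stays untouched
tmp <- copy(dtIris)[, TrimSpecies := substr(Species, 1, 3)]

# With a character vector in by=, the grouping columns keep their own names
res <- tmp[, list(AvgSepalWidth = mean(Sepal.Width)),
           by = c("TrimSpecies", myvar)]
names(res)
```

The copy adds some overhead, so this is a readability trade-off rather than a speed-up.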