I am trying to figure out whether there is a good way to use glue() in j of a data.table:
library(data.table)
library(glue)
data(iris)
dt.iris <- data.table(iris)
dt.iris[, myText := glue('The species is {Species} with sepal length of {Sepal.Length}')]
# Error in eval(parse(text = text, keep.source = FALSE), envir) :
# object 'Species' not found
I can use it if I indicate .envir = .SD:
dt.iris[, myText := glue('The species is {Species} with sepal length of {Sepal.Length}', .envir = .SD)]
# works OK
but I am wondering whether there is a way to avoid adding this every time. Maybe something like this:
glue1 <- function(...) glue(..., .envir = ???)
Why not simply use sprintf,
> library(data.table)
> dt.iris[, myText := sprintf('The species is %s with sepal length of %.2g',
+ Species, Sepal.Length)]
or paste, which is considerably slower though (roughly twice as slow as sprintf in the benchmark below).
> dt.iris[, myText := paste('The species is', Species, 'with sepal length of', Sepal.Length)]
> dt.iris
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          5.1         3.5          1.4         0.2    setosa
  2:          4.9         3.0          1.4         0.2    setosa
  3:          4.7         3.2          1.3         0.2    setosa
  4:          4.6         3.1          1.5         0.2    setosa
  5:          5.0         3.6          1.4         0.2    setosa
 ---
146:          6.7         3.0          5.2         2.3 virginica
147:          6.3         2.5          5.0         1.9 virginica
148:          6.5         3.0          5.2         2.0 virginica
149:          6.2         3.4          5.4         2.3 virginica
150:          5.9         3.0          5.1         1.8 virginica
                                                myText
  1:    The species is setosa with sepal length of 5.1
  2:    The species is setosa with sepal length of 4.9
  3:    The species is setosa with sepal length of 4.7
  4:    The species is setosa with sepal length of 4.6
  5:      The species is setosa with sepal length of 5
 ---
146: The species is virginica with sepal length of 6.7
147: The species is virginica with sepal length of 6.3
148: The species is virginica with sepal length of 6.5
149: The species is virginica with sepal length of 6.2
150: The species is virginica with sepal length of 5.9
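library(data.table)
dt.iris <- as.data.table(iris)
# Resample to one million rows so that timing differences are measurable.
dt.iris.l <- dt.iris[sample.int(nrow(dt.iris), 1e6, replace=TRUE), ]
# Helper: evaluate the glue template against `x` found three frames up the call
# stack, which is assumed here to be the data.table inside `[.data.table`.
# This depends on data.table internals and may break between versions.
gluedt <- function(...) glue::glue(..., .envir = parent.frame(3)$x)
microbenchmark::microbenchmark(
  sprintf=dt.iris.l[, myText := sprintf('The species is %s with sepal length of %.2g',
                                        Species, Sepal.Length)],
  paste=dt.iris.l[, myText := paste('The species is', Species, 'with sepal length of', Sepal.Length)],
  gluedt=dt.iris.l[, myText := gluedt('The species is {Species} with sepal length of {Sepal.Length}')],
  times=3L,
  check='identical'
)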
$ Rscript --vanilla foo.R
Unit: milliseconds
    expr      min        lq      mean    median        uq       max neval cld
 sprintf  748.210  755.7418  758.8391  763.2735  764.1537  765.0338     3   a
   paste 1545.685 1547.1562 1549.3632 1548.6278 1551.2025 1553.7771     3   b
  gluedt 1426.333 1437.6870 1443.4343 1449.0413 1451.9851 1454.9289     3   c
Data:
> dt.iris <- as.data.table(iris)
My approach is to simply use glue_data:
dt.iris[Sepal.Width > 4, myText := glue_data(.SD, "The species is {Species} with sepal length of {Sepal.Length}")]
I think the problem is that glue receives the whole template as a single string, "The species is {Species} with sepal length of {Sepal.Length}", so the column names never appear as symbols in j. paste and sprintf take the columns as separate arguments, as is usual in R, so data.table can see which columns j uses and works normally.
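To illustrate the point, here is a small base R sketch. It assumes (this is my reading, not something confirmed in the data.table documentation quoted here) that data.table decides which columns to expose by looking at the symbols used in j. all.vars() shows what such a scan would find; j_glue and j_sprintf are just illustrative names.

```r
# The two j expressions, quoted so that they are inspected rather than evaluated.
j_glue    <- quote(glue('The species is {Species} with sepal length of {Sepal.Length}'))
j_sprintf <- quote(sprintf('The species is %s with sepal length of %.2g', Species, Sepal.Length))

# all.vars() lists the variable symbols in an expression
# (function names such as glue and sprintf are excluded by default).
vars_glue    <- all.vars(j_glue)     # empty: the column names are inside a string literal
vars_sprintf <- all.vars(j_sprintf)  # the two column names appear as symbols

print(vars_glue)
print(vars_sprintf)
```

Mentioning .SD in j, as in the glue_data call above, sidesteps this because .SD carries all the columns with it.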
Another approach is to use metaprogramming:
gluedt <- function(...) substitute(glue(..., .envir = .SD))
dt.iris[Sepal.Width > 4, myText := eval(gluedt("The species is {Species} with sepal length of {Sepal.Length}"))]
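If the goal is only to shorten the call, a plain wrapper that takes .SD explicitly may be enough. The following is a minimal sketch rather than a definitive solution: glue_sd is a hypothetical helper name, and it still needs .SD passed in by hand. It also converts the result with as.character() so the new column is an ordinary character vector instead of a glue object.

```r
library(data.table)
library(glue)

# Hypothetical helper: fill the template from the columns of `sd`
# (intended to be .SD) and return a plain character vector.
glue_sd <- function(sd, template) {
  as.character(glue_data(sd, template))
}

dt.iris <- as.data.table(iris)
# Passing .SD makes data.table supply all columns to the helper.
dt.iris[, myText := glue_sd(.SD, "The species is {Species} with sepal length of {Sepal.Length}")]
head(dt.iris$myText, 2)
```

This works with a filter in i as well, for example dt.iris[Sepal.Width > 4, myText := glue_sd(.SD, "...")], because .SD then holds only the selected rows.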