From my simple data.table, for example:
dt1 <- fread("
col1 col2 col3
AAA ab cd
BBB ef gh
BBB ij kl
CCC mn nm")
I am making a new table, for example like this:
dt1[,
.(col3, new=.N),
by=col1]
> col1 col3 new
>1: AAA cd 1
>2: BBB gh 2
>3: BBB kl 2
>4: CCC nm 1
This works fine when I specify the column names explicitly. But when I store them in variables and try to use with=F, I get an error:
colBy <- 'col1'
colShow <- 'col3'
dt1[,
.(colShow, 'new'=.N),
by=colBy,
with=F]
# Error in `[.data.table`(dt1, , .(colShow, new = .N), by = colBy, with = F) : object 'ansvals' not found
I could not find any information about this error so far.
You are getting this error because with=FALSE tells data.table to treat j the way a data.frame would. It therefore expects a vector of column names, not an expression to be evaluated in j such as new=.N.
From the documentation of ?data.table about with:
By default with=TRUE and j is evaluated within the frame of x; column names can be used as variables. When with=FALSE j is a character vector of column names or a numeric vector of column positions to select, and the value returned is always a data.table.
When you use with=FALSE, you have to give the column names in j without a . before the (), like this: dt1[, (colShow), with=FALSE]. Other options are dt1[, c(colShow), with=FALSE] or dt1[, colShow, with=FALSE]. The same result can be obtained by using dt1[, .(col3)].
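As a minimal sketch of the selection forms mentioned above, the snippet below rebuilds the example table with data.table() instead of fread() and compares the results. The variable names sel_with, sel_dots and sel_expr are made up for illustration. The ..colShow form is not part of the original answer; it assumes a reasonably recent data.table version (the .. prefix was added in v1.10.2).

```r
library(data.table)

# Same data as in the question, built directly instead of via fread()
dt1 <- data.table(
  col1 = c("AAA", "BBB", "BBB", "CCC"),
  col2 = c("ab", "ef", "ij", "mn"),
  col3 = c("cd", "gh", "kl", "nm")
)
colShow <- "col3"

# data.frame-style selection: j is a character vector of column names
sel_with <- dt1[, colShow, with = FALSE]

# Assumption: data.table >= 1.10.2, where the .. prefix looks up colShow
# in the calling scope instead of among the columns
sel_dots <- dt1[, ..colShow]

# Explicit column name, evaluated inside the frame of dt1
sel_expr <- dt1[, .(col3)]

print(sel_with)
```

All three calls should return a one-column data.table containing col3.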
To sum up: with = FALSE is used to select columns the data.frame way, so j should then contain only column names or positions. Also, by using by = colBy you are telling data.table to evaluate j for each group, which contradicts with = FALSE.
From the documentation of ?data.table about j:
A single column name, single expression of column names, list() of expressions of column names, an expression or function call that evaluates to list (including data.frame and data.table which are lists, too), or (when with=FALSE) a vector of names or positions to select.
j is evaluated within the frame of the data.table; i.e., it sees column names as if they are variables. Use j=list(...) to return multiple columns and/or expressions of columns. A single column or single expression returns that type, usually a vector. See the examples.
See also points 1.d and 1.g of the introduction vignette of data.table.
ansvals is a name used in data.table internals. You can see where it appears in the code by using ctrl+f (Windows) or cmd+f (macOS) here.
The error object 'ansvals' not found looks like a bug to me. It should either be a helpful message or just work. I've filed issue #1440 linking back to this question, thank you.
Jaap is completely correct. Following on from his answer, you can use get() in j like this:
dt1
# col1 col2 col3
#1: AAA ab cd
#2: BBB ef gh
#3: BBB ij kl
#4: CCC mn nm
colBy
#[1] "col1"
colShow
#[1] "col3"
dt1[,.(get(colShow),.N),by=colBy]
# col1 V1 N
#1: AAA cd 1
#2: BBB gh 2
#3: BBB kl 2
#4: CCC nm 1
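Note that the get() version names the selected column V1 and the count N. If you want the col3 and new names from the question, one possible alternative (not from the answers above, just a sketch) is to pass the column name through .SDcols and append the count to .SD. The result name res is made up for illustration.

```r
library(data.table)

# Same data as in the question, built directly instead of via fread()
dt1 <- data.table(
  col1 = c("AAA", "BBB", "BBB", "CCC"),
  col2 = c("ab", "ef", "ij", "mn"),
  col3 = c("cd", "gh", "kl", "nm")
)
colBy <- "col1"
colShow <- "col3"

# .SD contains only the columns listed in .SDcols, so it keeps the name col3;
# c() appends the group size as a column called new, recycled within each group
res <- dt1[, c(.SD, list(new = .N)), by = colBy, .SDcols = colShow]
print(res)
```

This keeps with at its default of TRUE, so j is still evaluated per group and the variables only supply column names.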