If the column names in a data.table are in the form number + character, for example 4PCS, 5Y, etc., how can such a column be referenced as j in x[i,j] so that it is interpreted as an unquoted column name?
I assume this would solve my original problem: I wanted to add together several columns of a data.table whose names have the form number + character.
M <- data.table('4PCS'=1:4,'5Y'=4:1,X5Y=2:5)
> M[,4PCS+5Y]
Error: unexpected symbol in "M[,4PCS"
The new column should be the sum of 4PCS and 5Y.
Is there a way to refer to them in data.table in unquoted form? If these columns are referred to in data.table with the quoted "logic" of data.frame:
> M[,'5Y',with=FALSE]
5Y
[1,] 4
[2,] 3
[3,] 2
[4,] 1
then such a reference is limited in functionality. The addition does not work, just as it does not work in data.frame:
> M[,'4PCS'+'5Y',with=FALSE]
Error in "4PCS" + "5Y" : non-numeric argument to binary operator
The data.table functionality would allow operating on the columns. I would like to find a solution in the new data.table logic so that I can use its ability to transform columns by referencing their names.
The question is:
How do I quote a column name that starts with a number so that the data.table logic understands that it is a column name?
If column names contain any characters except letters, numbers, and underscores, the name must be delimited by enclosing it in back quotes (`).
Rules for R variable names are: a name must start with a letter or a period (.) and can be a combination of letters, digits, periods (.) and underscores (_). If it starts with a period, it cannot be followed by a digit.
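As a rough illustration of these naming rules, base R's make.names() can be used to check whether a name is syntactically valid. The helper is_syntactic below is a hypothetical name introduced only for this sketch; it relies on make.names() returning its input unchanged when the name is already valid.

```r
# Hypothetical helper: TRUE when a name can be typed without backticks,
# i.e. when make.names() has nothing to repair.
is_syntactic <- function(x) x == make.names(x)

is_syntactic(c("4PCS", "5Y", "X5Y"))
# [1] FALSE FALSE  TRUE

# make.names() prepends "X" to names that start with a digit
make.names(c("4PCS", "5Y"))
# [1] "X4PCS" "X5Y"
```

Names for which the check is FALSE, such as 4PCS, have to be wrapped in backticks wherever R expects an unquoted name.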
Modify / Add / Delete columns: To modify an existing column, or create a new one, use the := operator. Using the data.table := operator modifies the existing object 'in place', which has the benefit of being memory-efficient. Memory management is an important aspect of data.
You cannot make the column names “properly” numeric, but in this (character) form you can easily coerce them to numeric when you need to with the as.numeric() command.
I think this is what you're looking for, though I'm not sure. data.table is different from data.frame. Please have a look at the quick introduction, and then the FAQ (and also the reference manual if necessary).
require(data.table)
dt <- data.table("4PCS" = 1:3, y=3:1)
# 4PCS y
# 1: 1 3
# 2: 2 2
# 3: 3 1
# access column 4PCS
dt[, "4PCS"]
# returns a data.table
# 4PCS
# 1: 1
# 2: 2
# 3: 3
# to access multiple columns by name
dt[, c("4PCS", "y")]
Alternatively, if you need the column as a vector rather than as a data.table, you can access it using the $ notation:
dt$`4PCS` # notice the ` because the variable begins with a number
# [1] 1 2 3
# alternatively, as mnel mentioned under comments:
dt[, `4PCS`]
# [1] 1 2 3
Or, if you know the column number, you can access it using [[ as follows:
dt[[1]] # 4PCS is the first column here
# [1] 1 2 3
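When the column names are only available as character strings (for instance, built at run time), backticks cannot be typed in directly. The following is a minimal sketch of two options, assuming the same dt as above; the variable names cols, first_col and row_total are made up for this example.

```r
library(data.table)

dt <- data.table("4PCS" = 1:3, y = 3:1)

# Column names held as strings
cols <- c("4PCS", "y")

# Extract one column as a vector by its name
first_col <- dt[["4PCS"]]

# Add the selected columns element-wise; .SD contains only the .SDcols columns
row_total <- dt[, Reduce(`+`, .SD), .SDcols = cols]
```

Here .SDcols takes the names as ordinary strings, so it does not matter that 4PCS is not a valid unquoted name.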
Edit:
Thanks @joran. I think you're looking for this:
dt[, `4PCS` + y]
# [1] 4 4 4
Fundamentally the issue is that 4PCS is not a valid variable name in R (try 4PCS <- 1, you'll get the same "unexpected symbol" error). So to refer to it, we have to use backticks (compare `4PCS` <- 1).
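Applied to the table M from the question, the backtick approach might look like the sketch below. The new column name total is an assumption made for illustration; the question does not say what the sum column should be called.

```r
library(data.table)

M <- data.table("4PCS" = 1:4, "5Y" = 4:1, X5Y = 2:5)

# Backticks let j treat the non-syntactic names as ordinary column variables
M[, `4PCS` + `5Y`]
# [1] 5 5 5 5

# Store the sum as a new column by reference with :=
M[, total := `4PCS` + `5Y`]
```

The column X5Y needs no backticks because it already starts with a letter.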