I want to change the factor levels of a column using setattr. However, when the column is selected the standard data.table way (dt[ , col]), the levels are not updated. On the other hand, when the column is selected in a way that is unorthodox in a data.table setting, namely using $, it works.
library(data.table)
# Some data
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# We want to change levels of 'x' using setattr
# New desired levels
lev <- c("a_new", "b_new")
# Select column in the standard data.table way
setattr(x = d[ , x], name = "levels", value = lev)
# Levels are not updated
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# Select column in a non-standard data.table way using $
setattr(x = d$x, name = "levels", value = lev)
# Levels are updated
d
# x y
# 1: b_new 1
# 2: a_new 2
# 3: a_new 3
# 4: b_new 4
# Just check if d[ , x] really is the same as d$x
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
identical(d[ , x], d$x)
# [1] TRUE
# Yes, it seems so
It feels like I'm missing some data.table (R?) basics here. Can anyone explain what's going on?
I have found two other posts on setattr and levels:
setattr on levels preserving unwanted duplicates (R data.table)
How does one change the levels of a factor column in a data.table
Both of them used $ to select the column. Neither of them mentioned the [ , col] way.
It might help to understand if you look at the address from both expressions:
address(d$x)
# [1] "0x10e4ac4d8"
address(d$x)
# [1] "0x10e4ac4d8"
address(d[,x])
# [1] "0x105e0b520"
address(d[,x])
# [1] "0x105e0a600"
Note that the address from the first expression (d$x) stays the same across calls, while the address from the second expression (d[ , x]) changes each time. This indicates that d[ , x] returns a new copy of the column on every call, so setattr on that copy has no effect on the original data.table.
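To make the difference concrete, here is a minimal sketch that reuses the data from the question. The variable names levels_after_copy and levels_after_ref are only illustrative; everything else comes from the question's own code.

```r
library(data.table)

d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
lev <- c("a_new", "b_new")

# d[ , x] hands back a copy of the column, so setattr modifies
# that temporary copy and d itself is left unchanged
setattr(d[ , x], "levels", lev)
levels_after_copy <- levels(d$x)

# d$x refers to the column vector stored in d, so setattr
# changes the levels of the column in place
setattr(d$x, "levels", lev)
levels_after_ref <- levels(d$x)

print(levels_after_copy)
print(levels_after_ref)
```

The first print should still show the original levels ("a", "b"), and the second the new ones ("a_new", "b_new"), matching the behaviour described in the question.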