For a single column:
dtnew <- dt[, Quarter:=as.character(Quarter)]
str(dtnew)
Classes ‘data.table’ and 'data.frame': 10 obs. of 3 variables:
$ ID : Factor w/ 2 levels "A","B": 1 1 1 1 1 2 2 2 2 2
$ Quarter: chr "1" "2" "3" "4" ...
$ value : num -0.838 0.146 -1.059 -1.197 0.282 ...
Using lapply
and as.character
:
dtnew <- dt[, lapply(.SD, as.character), by=ID]
str(dtnew)
Classes ‘data.table’ and 'data.frame': 10 obs. of 3 variables:
$ ID : Factor w/ 2 levels "A","B": 1 1 1 1 1 2 2 2 2 2
$ Quarter: chr "1" "2" "3" "4" ...
$ value : chr "1.487145280568" "-0.827845218358881" "0.028977182770002" "1.35392750102305" ...
Try this
DT <- data.table(X1 = c("a", "b"), X2 = c(1,2), X3 = c("hello", "you"))
changeCols <- colnames(DT)[which(as.vector(DT[,lapply(.SD, class)]) == "character")]
DT[,(changeCols):= lapply(.SD, as.factor), .SDcols = changeCols]
Raising Matt Dowle's comment to Geneorama's answer (https://stackoverflow.com/a/20808945/4241780) to make it more obvious (as encouraged), you can use for(...)set(...)
.
library(data.table)
DT = data.table(a = LETTERS[c(3L,1:3)], b = 4:7, c = letters[1:4])
DT1 <- copy(DT)
names_factors <- c("a", "c")
for(col in names_factors)
set(DT, j = col, value = as.factor(DT[[col]]))
sapply(DT, class)
#> a b c
#> "factor" "integer" "factor"
Created on 2020-02-12 by the reprex package (v0.3.0)
See another of Matt's comments at https://stackoverflow.com/a/33000778/4241780 for more info.
Edit.
As noted by Espen and in help(set)
, j
may be "Column name(s) (character) or number(s) (integer) to be assigned value when column(s) already exist". So names_factors <- c(1L, 3L)
will also work.
This is a BAD way to do it! I'm only leaving this answer in case it solves other weird problems. These better methods are the probably partly the result of newer data.table versions... so it's worth while to document this hard way. Plus, this is a nice syntax example for eval
substitute
syntax.
library(data.table)
dt <- data.table(ID = c(rep("A", 5), rep("B",5)),
fac1 = c(1:5, 1:5),
fac2 = c(1:5, 1:5) * 2,
val1 = rnorm(10),
val2 = rnorm(10))
names_factors = c('fac1', 'fac2')
names_values = c('val1', 'val2')
for (col in names_factors){
e = substitute(X := as.factor(X), list(X = as.symbol(col)))
dt[ , eval(e)]
}
for (col in names_values){
e = substitute(X := as.numeric(X), list(X = as.symbol(col)))
dt[ , eval(e)]
}
str(dt)
which gives you
Classes ‘data.table’ and 'data.frame': 10 obs. of 5 variables:
$ ID : chr "A" "A" "A" "A" ...
$ fac1: Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5
$ fac2: Factor w/ 5 levels "2","4","6","8",..: 1 2 3 4 5 1 2 3 4 5
$ val1: num 0.0459 2.0113 0.5186 -0.8348 -0.2185 ...
$ val2: num -0.0688 0.6544 0.267 -0.1322 -0.4893 ...
- attr(*, ".internal.selfref")=<externalptr>
If you have a list of column names in data.table, you want to change the class of do:
convert_to_character <- c("Quarter", "value")
dt[, convert_to_character] <- dt[, lapply(.SD, as.character), .SDcols = convert_to_character]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With