In data.table v1.9.6, you can split a variable into columns like so:
library(data.table)
DT = data.table(x=c("A/B", "A", "B"), y=1:3)
DT[, c("c1", "c2") := tstrsplit(x, "/", fixed=TRUE)][]
The number of required splits (2 above) is not always known in advance. How can I generate the required variable names programmatically once the number of splits is known?
n = 2 # desired number of splits
# naive attempt to build required string
m = paste0("'", "myvar", 1:n, "'", collapse = ",")
m = paste0("c(", m, ")" )
# [1] "c('myvar1','myvar2')"
DT[, m := tstrsplit(x, "/", fixed=TRUE)][] # doesn't work
Two methods. The first is strongly suggested:
#one
n=2
DT[, paste0("myvar", 1:n) := tstrsplit(x, "/", fixed=TRUE)][]
# x y myvar1 myvar2
#1: A/B 1 A B
#2: A 2 A NA
#3: B 3 B NA
#two (reuses the string m built in the question)
DT[, eval(parse(text=m)) := tstrsplit(x, "/", fixed=TRUE)][]
# x y myvar1 myvar2
#1: A/B 1 A B
#2: A 2 A NA
#3: B 3 B NA
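A variation on the first method, in case it helps: the names can be stored in a variable and wrapped in parentheses on the left-hand side of `:=`. This is a minimal sketch; the variable name `cols` is only illustrative and does not appear in the original answer.

```r
library(data.table)

DT <- data.table(x = c("A/B", "A", "B"), y = 1:3)
n <- 2                                  # desired number of splits
cols <- paste0("myvar", seq_len(n))     # "myvar1" "myvar2" (illustrative name)

# The parentheses tell data.table to evaluate `cols`
# instead of creating a column literally named "cols".
DT[, (cols) := tstrsplit(x, "/", fixed = TRUE)]
DT[]
```

This avoids building code as a string, which is the main reason the `eval(parse())` route is usually not preferred.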
Extra: if you do not know the number of splits beforehand, compute it from the data first:
splits <- max(lengths(strsplit(DT$x, "/", fixed=TRUE)))
DT[, paste0("myvar", 1:splits) := tstrsplit(x, "/", fixed=TRUE)][]
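When the number of splits is unknown, it may also be possible to split once and derive the names from the result, rather than splitting twice. The sketch below assumes that is acceptable for your data; `parts` and `cols` are illustrative names, not part of the original answer.

```r
library(data.table)

DT <- data.table(x = c("A/B", "A", "B"), y = 1:3)

# tstrsplit() returns a list with one element per split position
parts <- tstrsplit(DT$x, "/", fixed = TRUE)
cols  <- paste0("myvar", seq_along(parts))

# Assign every element of the list to its own column
DT[, (cols) := parts]
DT[]
```

Here the number of new columns follows from `length(parts)`, so no separate count is needed.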
Another simple way of doing this: instead of making extra columns, you can stack the split strings in a single column:
DT = data.table(x=c("A/B", "A", "B"), y=1:3)
DT1 <- DT[, .(new=tstrsplit(x, "/", fixed=TRUE)), by=y]
DT1
# y new
# 1: 1 A
# 2: 1 B
# 3: 2 A
# 4: 3 B
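Because `tstrsplit()` returns a list, `new` may end up as a list column in the result above. If a plain character column is wanted, one possible alternative is base `strsplit()` with `unlist()`. This is a sketch rather than the answer's own method, and `DT_long` is an illustrative name.

```r
library(data.table)

DT <- data.table(x = c("A/B", "A", "B"), y = 1:3)

# One row per split piece, grouped by y; unlist() flattens the
# per-group result of strsplit() into a character vector
DT_long <- DT[, .(new = unlist(strsplit(x, "/", fixed = TRUE))), by = y]
DT_long
```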