I have a column that contains a mixed string of characters, and I've created a column for each of the unique characters in the string. I need to encode each of these columns with a 1 if its character appears anywhere in the string, and a 0 otherwise.
library(data.table)
d = data.table(string = c("P_P_F_", "U_F_/", "-_P_B"),
               P = c(1, 0, 1),
               F = c(1, 1, 0),
               U = c(0, 1, 0),
               B = c(0, 0, 1))
In the example above, string has the characters that need to be matched to the corresponding columns. The first string has a P and an F, so I have a 1 in those columns and a 0 in the rest.
The characters within the string are always separated by an underscore, and the string has a maximum length of 7.
The data set is quite large, so I would prefer a data.table solution if possible.
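We can use mtabulate after splitting the string. Starting from the data with only the string column:
library(data.table)
library(qdapTools)
# Sample data: just the strings, without the indicator columns
d <- data.table(string = c("P_P_F_", "U_F_/", "-_P_B"))
# Split each string on the separators, then tabulate the pieces row by row
cbind(d, mtabulate(strsplit(d$string, "[_/-]")))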
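Since the question asks for a data.table solution, here is a minimal sketch that needs no extra package. It rests on two assumptions that are mine rather than the asker's: the target characters are known up front (the chars vector below is hypothetical, taken from the four columns in the question), and the underscore is the only separator that matters. It writes 1/0 presence flags rather than counts, so the repeated P in the first string still gives a 1. As I understand it, mtabulate tabulates counts, so it may be worth checking its output for that row.

```r
library(data.table)

d <- data.table(string = c("P_P_F_", "U_F_/", "-_P_B"))

# Assumption: the characters to flag are the four columns from the question.
chars <- c("P", "F", "U", "B")

# Split every string on "_" once, up front, so the work is not repeated per column.
tokens <- strsplit(d$string, "_", fixed = TRUE)

# For each target character, add an integer column by reference:
# 1 if the character is among that row's tokens, 0 otherwise.
d[, (chars) := lapply(chars, function(ch) {
  as.integer(vapply(tokens, function(tk) ch %in% tk, logical(1)))
})]

print(d)
```

Because := adds the columns by reference, the table is not copied, which may help on a large data set. Symbols such as "/" and "-" are simply ignored, since they are not listed in chars.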