I have a character vector like this:
a <- c("a,b,c", "a,b", "a,b,c,d")
What I would like to do is create a data frame where the individual letters in each string are represented by dummy columns:
   a b c d
1] 1 1 1 0
2] 1 1 0 0
3] 1 1 1 1
I have a feeling that I need to be using some combination of read.table and reshape, but I am really struggling. Any help is appreciated.
You can try cSplit_e from my "splitstackshape" package:
library(data.table)  # provides as.data.table(), in case splitstackshape does not attach it
library(splitstackshape)
a <- c("a,b,c", "a,b", "a,b,c,d")
cSplit_e(as.data.table(a), "a", ",", type = "character", fill = 0)
#          a a_a a_b a_c a_d
# 1:   a,b,c   1   1   1   0
# 2:     a,b   1   1   0   0
# 3: a,b,c,d   1   1   1   1
cSplit_e(as.data.table(a), "a", ",", type = "character", fill = 0, drop = TRUE)
#    a_a a_b a_c a_d
# 1:   1   1   1   0
# 2:   1   1   0   0
# 3:   1   1   1   1
There's also mtabulate from "qdapTools":
library(qdapTools)
mtabulate(strsplit(a, ","))
#   a b c d
# 1 1 1 1 0
# 2 1 1 0 0
# 3 1 1 1 1
A very direct base R approach is to use table along with stack and strsplit:
table(rev(stack(setNames(strsplit(a, ",", TRUE), seq_along(a)))))
#    values
# ind a b c d
#   1 1 1 1 0
#   2 1 1 0 0
#   3 1 1 1 1
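The one-liner above packs several steps together. As a rough sketch, it can be unpacked like this (the intermediate names pieces, long and tab are only illustrative and do not appear in the original answer):

```r
a <- c("a,b,c", "a,b", "a,b,c,d")

# 1. Split on literal commas and name each piece by its position in `a`
pieces <- setNames(strsplit(a, ",", fixed = TRUE), seq_along(a))

# 2. stack() turns the named list into a long data frame:
#    column `values` holds the letters, column `ind` the source position
long <- stack(pieces)

# 3. rev() swaps the column order to (ind, values), so table() puts
#    positions in rows and letters in columns
tab <- table(rev(long))
tab
```

Without the rev() call the table would come out transposed, with letters as rows and positions as columns.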
Another convoluted base-R solution:
x <- strsplit(a,",")
xl <- unique(unlist(x))
t(sapply(x,function(z)table(factor(z,levels=xl))))
which gives
     a b c d
[1,] 1 1 1 0
[2,] 1 1 0 0
[3,] 1 1 1 1
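Note that this result is a matrix, while the question asks for a data frame. A minimal base-R sketch that ends in a data frame might look like the following. It records presence or absence (0/1) rather than counts, which is an assumption about what "dummy columns" should mean if a letter is repeated within one string; the names parts, lvls, dummies and res are only illustrative.

```r
a <- c("a,b,c", "a,b", "a,b,c,d")

# Split each string on literal commas
parts <- strsplit(a, ",", fixed = TRUE)

# Every distinct letter, sorted, to fix the column order
lvls <- sort(unique(unlist(parts)))

# One row per input string: 1 if the letter is present, 0 otherwise
dummies <- t(vapply(parts,
                    function(p) as.integer(lvls %in% p),
                    integer(length(lvls))))
colnames(dummies) <- lvls

res <- as.data.frame(dummies)
res
#   a b c d
# 1 1 1 1 0
# 2 1 1 0 0
# 3 1 1 1 1
```

Using vapply with an explicit integer template keeps the column type predictable, so each dummy column in res is an integer vector.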