I have a list of named vectors (see below, and the dput version at the end) that I would like to "merge" into a matrix, filling in zeros wherever a vector doesn't contain a name (a character in this case). This doesn't seem that hard, but I haven't found a working base solution. I thought about using match, but that seems very costly in time, and I'm sure there's a fancy way to use do.call and rbind together.
List of Named Vectors:
$greg
e i k l
1 2 1 1
$sam
! c e i t
1 1 1 2 1
$teacher
? c i k l
1 1 1 1 1
Final Desired Output
        ! ? c e i k l t
greg    0 0 0 1 2 1 1 0
sam     1 0 1 1 2 0 0 1
teacher 0 1 1 0 1 1 1 0
Most solutions will probably produce the output below instead; filling the NAs with 0 afterward is easy:
         !  ?  c  e  i  k  l  t
greg    NA NA NA  1  2  1  1 NA
sam      1 NA  1  1  2 NA NA  1
teacher NA  1  1 NA  1  1  1 NA
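For reference, the NA-to-zero step can be sketched as follows. The matrix `m` here is a made-up three-column excerpt of the output above, not something from the original data:

```r
# Hypothetical excerpt of the NA-filled matrix (columns "!", "?", "i" only)
m <- matrix(c(NA, 1, NA,   # column "!"
              NA, NA, 1,   # column "?"
              1, 2, 1),    # column "i"
            nrow = 3,
            dimnames = list(c("greg", "sam", "teacher"), c("!", "?", "i")))

# is.na(m) is a logical matrix of the same shape; assigning through it
# replaces every NA cell and leaves the other cells untouched
m[is.na(m)] <- 0
m
```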
Sample Data
L2 <- structure(list(greg = structure(c(1L, 2L, 1L, 1L), .Dim = 4L, .Dimnames = structure(list(
c("e", "i", "k", "l")), .Names = ""), class = "table"), sam = structure(c(1L,
1L, 1L, 2L, 1L), .Dim = 5L, .Dimnames = structure(list(c("!",
"c", "e", "i", "t")), .Names = ""), class = "table"), teacher = structure(c(1L,
1L, 1L, 1L, 1L), .Dim = 5L, .Dimnames = structure(list(c("?",
"c", "i", "k", "l")), .Names = ""), class = "table")), .Names = c("greg",
"sam", "teacher"))
Here's a fairly straightforward base solution:
# first determine all possible column names
cols <- sort(unique(unlist(lapply(L2,names), use.names=FALSE)))
# initialize the output
out <- matrix(0, length(L2), length(cols), dimnames=list(names(L2),cols))
# loop over list and fill in the matrix
for(i in seq_along(L2)) {
  out[names(L2)[i], names(L2[[i]])] <- L2[[i]]
}
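The assignment by name inside the loop is what lines each vector up with the right columns. A single-row sketch of that step, with made-up stand-ins `cols` and `v` (a plain named integer vector shaped like L2$greg), also shows that it is equivalent to positional indexing via match, which the question was worried about:

```r
# Hypothetical stand-ins: the full column set and one named vector
cols <- c("!", "?", "c", "e", "i", "k", "l", "t")
v <- setNames(c(1L, 2L, 1L, 1L), c("e", "i", "k", "l"))

# Start from all zeros, one named slot per column
filled <- setNames(numeric(length(cols)), cols)
# Character indexing looks up the target slots by name
filled[names(v)] <- v
filled

# The same assignment written with explicit positions from match()
filled2 <- setNames(numeric(length(cols)), cols)
filled2[match(names(v), cols)] <- v
identical(filled, filled2)
```

Since character indexing already performs the name lookup, the loop does not need a separate match call per vector.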
UPDATE with benchmarks:
f1 <- function(L2) {
  cols <- sort(unique(unlist(lapply(L2,names), use.names=FALSE)))
  out <- matrix(0, length(L2), length(cols), dimnames=list(names(L2),cols))
  for(i in seq_along(L2)) out[names(L2)[i], names(L2[[i]])] <- L2[[i]]
  out
}
f2 <- function(L2) {
  L.names <- sort(unique(unlist(sapply(L2, names))))
  L3 <- t(sapply(L2, function(x) x[L.names]))
  colnames(L3) <- L.names
  L3[is.na(L3)] <- 0
  L3
}
f3 <- function(L2) {
  m <- do.call(rbind, lapply(L2, as.data.frame))
  m$row <- sub("[.].*", "", rownames(m))
  m$Var1 <- factor(as.character(m$Var1))
  xtabs(Freq ~ row + Var1, m)
}
library(rbenchmark)
benchmark(f1(L2), f2(L2), f3(L2), order="relative")[,1:5]
#     test replications elapsed relative user.self
# 1 f1(L2)          100   0.022    1.000     0.020
# 2 f2(L2)          100   0.051    2.318     0.052
# 3 f3(L2)          100   0.788   35.818     0.760
# larger test: 676 named vectors, each with 1 to 10 distinct letter names
set.seed(21)
L <- replicate(676, {n=sample(10,1); l=sample(26,n);
  setNames(sample(6,n,TRUE), letters[l])}, simplify=FALSE)
names(L) <- levels(interaction(letters,LETTERS))
benchmark(f1(L), f2(L), order="relative")[,1:5]
#    test replications elapsed relative user.self
# 1 f1(L)          100    1.84    1.000     1.828
# 2 f2(L)          100    4.24    2.304     4.220
I think something like this:
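# collect every name that occurs (variable renamed so it doesn't mask base::names)
all.names <- sort(unique(unlist(lapply(L2, names), use.names=FALSE)))
# indexing by a missing name yields NA, so each vector expands to full width
L3 <- t(vapply(L2, function(x) x[all.names], FUN.VALUE=numeric(length(all.names))))
colnames(L3) <- all.names
L3[is.na(L3)] <- 0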
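Since the question asked about do.call and rbind, the same idea can probably be written that way too. This is only a sketch: it assumes plain named integer vectors rather than the table objects in the dput output, and `all.names` is a name picked for illustration.

```r
# Assumption: plain named integer vectors stand in for the tables in L2
L2 <- list(
  greg    = setNames(c(1L, 2L, 1L, 1L), c("e", "i", "k", "l")),
  sam     = setNames(c(1L, 1L, 1L, 2L, 1L), c("!", "c", "e", "i", "t")),
  teacher = setNames(c(1L, 1L, 1L, 1L, 1L), c("?", "c", "i", "k", "l"))
)

# Every name that occurs in any vector
all.names <- sort(unique(unlist(lapply(L2, names), use.names = FALSE)))

# Expand each vector to full width (missing names give NA), then stack the
# expanded vectors as rows; the list names become the row names
L3 <- do.call(rbind, lapply(L2, function(x) x[all.names]))
colnames(L3) <- all.names   # the expanded vectors carry <NA> names, so reset
L3[is.na(L3)] <- 0L
L3
```

Compared with the vapply version, this skips the transpose, because rbind already puts one vector per row.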