I'm having trouble turning my data.frame into a square matrix. My data currently looks something like this:
var1 var2 value
A B 4
C D 5
D A 2
B D 1
I'm trying to transform the data.frame to a matrix that looks like this:
A B C D
A 0 4 0 2
B 4 0 0 1
C 0 0 0 5
D 2 1 5 0
I have tried many functions from different R packages but still cannot find a solution.
Here is a base R method using matrix indexing on character vectors.
## set up storage matrix
# get names for row and columns
nameVals <- sort(unique(unlist(dat[1:2])))
# construct 0 matrix of correct dimensions with row and column names
myMat <- matrix(0, length(nameVals), length(nameVals), dimnames = list(nameVals, nameVals))
# fill in the matrix with matrix indexing on row and column names
myMat[as.matrix(dat[c("var1", "var2")])] <- dat[["value"]]
This returns
myMat
A B C D
A 0 4 0 0
B 0 0 0 1
C 0 0 0 5
D 2 0 0 0
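Note that the matrix above only fills the (var1, var2) cells, whereas the question asks for a symmetric result. A minimal sketch of one way to mirror the entries, assuming each unordered pair appears at most once in dat, is to repeat the assignment with the two index columns swapped. The name symMat is only illustrative.

```r
# same example data as in the "data" block of this answer
dat <- data.frame(var1 = c("A", "C", "D", "B"),
                  var2 = c("B", "D", "A", "D"),
                  value = c(4L, 5L, 2L, 1L),
                  stringsAsFactors = FALSE)

# zero matrix with one row and one column per distinct name
nameVals <- sort(unique(unlist(dat[1:2])))
symMat <- matrix(0, length(nameVals), length(nameVals),
                 dimnames = list(nameVals, nameVals))

# fill the (var1, var2) cells, then the mirrored (var2, var1) cells
symMat[as.matrix(dat[c("var1", "var2")])] <- dat[["value"]]
symMat[as.matrix(dat[c("var2", "var1")])] <- dat[["value"]]

isSymmetric(symMat)
```

If the data could contain both an A-B and a B-A row with different values, the second assignment would overwrite the first, so that case needs a separate decision about which value wins.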
For details on how this powerful form of indexing works, see the Matrices and arrays section of the help file ?"[". In particular, the fourth paragraph of the section discusses this form of indexing.
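As a small illustration of this indexing form, here is a sketch on a made-up 2 x 2 matrix (m and idx are hypothetical names, not part of the original data). Each row of a two-column character matrix names one (row, column) cell.

```r
# toy matrix with named rows and columns
m <- matrix(0, 2, 2, dimnames = list(c("x", "y"), c("x", "y")))

# each row of idx picks one cell: ("x", "y") and ("y", "y")
idx <- cbind(c("x", "y"), c("y", "y"))

# assign one value per indexed cell; all other cells stay 0
m[idx] <- c(10, 20)
m
```

The same mechanism is what lets as.matrix(dat[c("var1", "var2")]) address cells of myMat by name.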
Note that I assume that the first two variables are character vectors rather than factors. This makes things a bit easier, since I don't have to use as.character to coerce them.
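If the columns do arrive as factors, one possible way to coerce them up front is sketched below. datF is a hypothetical factor version of the example data, and this conversion step is an assumption about how you might prepare the input, not part of the method above.

```r
# hypothetical input where the name columns are factors
datF <- data.frame(var1 = factor(c("A", "C", "D", "B")),
                   var2 = factor(c("B", "D", "A", "D")),
                   value = c(4L, 5L, 2L, 1L))

# find the factor columns and convert only those to character
isFac <- vapply(datF, is.factor, logical(1))
datF[isFac] <- lapply(datF[isFac], as.character)

str(datF)
```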
To convert the result to a data.frame, simply wrap the resulting matrix in the as.data.frame function, e.g. as.data.frame(myMat).
data
dat <-
structure(list(var1 = c("A", "C", "D", "B"), var2 = c("B", "D",
"A", "D"), value = c(4L, 5L, 2L, 1L)), .Names = c("var1", "var2",
"value"), class = "data.frame", row.names = c(NA, -4L))
If we make all the character columns factors with levels 'A', 'B', 'C', 'D', then we can use xtabs without dropping any columns.
Unfortunately, the resulting matrix isn't symmetric.
library('tidyverse')
df <- tribble(
  ~var1, ~var2, ~value,
  'A', 'B', 4,
  'C', 'D', 5,
  'D', 'A', 2,
  'B', 'D', 1
)
# give both name columns the same full set of levels so the table is square
df %>%
  mutate_if(is.character, factor, levels = c('A', 'B', 'C', 'D')) %>%
  xtabs(value ~ var1 + var2, ., drop.unused.levels = FALSE)
# var2
# var1 A B C D
# A 0 4 0 0
# B 0 0 0 1
# C 0 0 0 5
# D 2 0 0 0
To make it symmetric, I just added its transpose to itself. This feels like a bit of a hack, though.
df %>%
  mutate_if(is.character, factor, levels = c('A', 'B', 'C', 'D')) %>%
  xtabs(value ~ var1 + var2, ., drop.unused.levels = FALSE) %>%
  '+'(., t(.))
# var2
# var1 A B C D
# A 0 4 0 2
# B 4 0 0 1
# C 0 0 0 5
# D 2 1 5 0
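One caveat with adding the transpose: if the data contained both an A-B row and a B-A row, their values would be summed rather than mirrored. The sketch below shows this on a made-up 2 x 2 matrix (m is hypothetical) and uses base R's pmax as one possible alternative, assuming values are non-negative and reciprocal pairs carry the same value.

```r
# toy matrix where the pair appears in both directions
m <- matrix(0, 2, 2, dimnames = list(c("A", "B"), c("A", "B")))
m["A", "B"] <- 4
m["B", "A"] <- 4

# adding the transpose doubles the reciprocal entries
doubled <- m + t(m)

# the element-wise maximum keeps the original value instead
mirrored <- pmax(m, t(m))

doubled
mirrored
```

For the example data in the question no pair is repeated, so both approaches give the same matrix.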