I have a dataset that looks something like this, but much larger:
x.col<-c(1,1,1,1,2,2,2,3,3,4)
y.col<-c(2,3,4,5,3,4,5,4,5,5)
response<-c(1,0,1,1,1,1,0,0,0,0)
ds<-data.frame(cbind(x.col,y.col,response))
From these data, I would like to create a square matrix whose rows and columns carry the same labels, and whose cells hold the response for each pair of x and y values. The output would then look something like this:
one<-c(NA,1,0,1,1)
two<-c(1,NA,1,1,0)
three<-c(0,1,NA,0,0)
four<-c(1,1,0,NA,0)
five<-c(1,0,0,0,NA)
mx<-cbind(one,two,three,four,five)
row.names(mx)<-c(1,2,3,4,5)
colnames(mx)<-c(1,2,3,4,5)
Note that the diagonals are NA because they refer to cells in which the x and y values are identical.
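You can try the following. It relies on the values in x.col and y.col being the consecutive integers 1 to n, so they can be used directly as row and column positions: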
Un <- unique(unlist(ds[1:2]))
m1 <- matrix(0, length(Un),length(Un), dimnames=list(Un, Un))
m1[as.matrix(ds[1:2])] <- ds[,3]
m1 <- m1+t(m1)
diag(m1) <- NA
identical(m1, mx)
#[1] TRUE
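The step that does the work above is indexing a matrix with a two-column matrix, where each row is read as a (row, column) pair. A minimal sketch on a made-up 3 x 3 case (the names `m`, `idx` and `m_sym` are only for illustration) may make this easier to follow:

```r
# Toy illustration: assign values to cells given as (row, col) pairs
m <- matrix(0, 3, 3)
idx <- cbind(c(1, 1, 2), c(2, 3, 3))  # cells (1,2), (1,3), (2,3)
m[idx] <- c(1, 0, 1)                  # one value per (row, col) pair

# Mirror the upper triangle into the lower one, then blank the diagonal
m_sym <- m + t(m)
diag(m_sym) <- NA
m_sym
```

Adding the transpose only gives the intended result when each pair appears in one direction; if both (x, y) and (y, x) were present in the data, their values would be summed.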
Based on the new dataset, this may work:
ds1 <- read.csv('lulc.mean21apr2015.csv')
library(data.table)#v1.9.5+
Un1 <- unique(unlist(ds1[2:3]))
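# Cast to wide form; using the same factor levels on both sides keeps
# the row and column order identical
res <- dcast(setDT(ds1),
             factor(id.origin, levels=Un1) ~ factor(id.dest, levels=Un1),
             value.var='lulc')
# Replace NA (pairs that do not occur in the data) with 0, by reference
for(j in 1:ncol(res)){
  set(res, i=which(is.na(res[[j]])), j=j, value=0)
}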
res1 <- as.matrix(res[,-1, with=FALSE])
row.names(res1) <- res[[1]]
res1[1:3,1:3]
#      9606 25216 12865
#9606     0     1     0
#25216    1     0     1
#12865    0     1     0
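Or a modification of the previous solution, converting the IDs to positions first:
m1 <- matrix(0, length(Un1), length(Un1), dimnames=list(Un1, Un1))
# Map each ID to its position in Un1. Columns are picked with [[ ]] so this
# works whether ds1 is a data.frame or (after setDT) a data.table
indx <- cbind(as.integer(factor(ds1[[2]], levels=Un1)),
              as.integer(factor(ds1[[3]], levels=Un1)))
m1[indx] <- ds1[[4]]
all.equal(m1, res1)
#[1] TRUE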
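Since the CSV file is not available here, the same idea can be sketched on a small made-up data frame. The column names id.origin, id.dest and lulc follow the code above, but the values in `toy` are invented purely for illustration, and `match()` is used in place of the factor conversion as an equivalent way to turn IDs into positions:

```r
# Made-up stand-in for the CSV (illustrative values only)
toy <- data.frame(id.origin = c(9606, 25216, 25216, 12865),
                  id.dest   = c(25216, 9606, 12865, 25216),
                  lulc      = c(1, 1, 1, 1))

# All IDs that occur in either column, in order of first appearance
ids <- unique(unlist(toy[c("id.origin", "id.dest")]))
out <- matrix(0, length(ids), length(ids), dimnames = list(ids, ids))

# Translate each ID to its position in `ids`, then fill by (row, col) pairs
pos <- cbind(match(toy$id.origin, ids), match(toy$id.dest, ids))
out[pos] <- toy$lulc
out
```

Because the IDs are translated to positions before indexing, they do not need to be consecutive integers, which is what the first solution depended on.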