
Extract correlated elements of a correlation matrix

Tags: r, correlation

I have a correlation matrix in R. I want to find the groups of elements whose pairwise correlation is above 0.95, know how many such groups there are, and store each group in a vector.

X <- matrix(0,3,5) 
X[,1] <- c(1,2,3)
X[,2] <- c(1,2.2,3)*2
X[,3] <- c(1,2,3.3)*3
X[,4] <- c(6,5,1)
X[,5] <- c(6.1,5,1.2)*4

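cor.matrix <- cor(X)
# keep only the strict lower triangle, so each pair is counted once
# and the diagonal (always 1) is zeroed out
cor.matrix <- cor.matrix*lower.tri(cor.matrix)
# (row, col) index pairs whose correlation exceeds 0.95
cor.vector <- which(cor.matrix>0.95, arr.ind=TRUE)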

cor.vector then contains:

     row col 
[1,]   2   1 
[2,]   3   1 
[3,]   3   2 
[4,]   5   4 

That means, as expected, that vectors 1, 2 and 3 are correlated with each other, and so are vectors 4 and 5.

What I need as the final result are the two vectors c(1,2,3) and c(4,5).

This is a simple example; in practice I am processing large matrices.

Xavi asked Apr 25 '13 12:04



1 Answer

Here's an approach using the igraph package:

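library(igraph)
# each row of cor.vector becomes an undirected edge between two columns of X
# (graph_from_data_frame() and components() are the current names of
# graph.data.frame() and clusters())
g <- graph_from_data_frame(cor.vector, directed = FALSE)
# split the vertex ids by the connected component they belong to
split(unique(as.vector(cor.vector)), components(g)$membership)
# $`1`
# [1] 2 3 1

# $`2`
# [1] 5 4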

This finds the clusters in the graph g, that is, its connected components (sets of vertices with no edges between one set and another), as illustrated in the figure below. The graph's vertices are created in the order in which they first appear in cor.vector, and the cluster membership comes back in that same order. That is, for vertices c(2,3,5,1,4) the membership is c(1,1,2,1,2), which gives two clusters (cluster 1 and cluster 2). Splitting the vertex ids by this membership yields the groups.

enter image description here
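
For readers who cannot install igraph, the same connected-components idea can be sketched in base R with a small union-find. This is only an illustration, not part of the answer above: `group_pairs` and its argument `n` are hypothetical names, and it assumes `pairs` is a non-empty two-column matrix of positive indices such as the `cor.vector` from the question.

```r
# Hypothetical helper (not from the original answer): group indices that are
# linked, directly or transitively, by the rows of a two-column matrix.
group_pairs <- function(pairs, n = max(pairs)) {
  pairs  <- matrix(as.integer(pairs), ncol = 2)  # drop dimnames, force integer
  parent <- seq_len(n)                           # each index starts as its own root

  # follow parent links until reaching an index that is its own parent
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  for (k in seq_len(nrow(pairs))) {
    a <- find_root(pairs[k, 1])
    b <- find_root(pairs[k, 2])
    if (a != b) parent[max(a, b)] <- min(a, b)   # merge, keeping the smaller root
  }

  roots  <- vapply(seq_len(n), find_root, integer(1))
  groups <- unname(split(seq_len(n), roots))
  groups[lengths(groups) > 1]                    # drop indices with no partner
}

# The index pairs from the question's cor.vector
pairs <- cbind(row = c(2, 3, 3, 5), col = c(1, 1, 2, 4))
group_pairs(pairs)
# expected: list(1:3, 4:5)
```

Unlike the `split()` call above, this sketch returns each group sorted by index. The root lookup has no path compression, so for very large inputs the igraph version is likely the better choice.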

Arun answered Oct 03 '22 09:10
