
Reshape correlation matrix to be stacked by column pairs

Tags: r, matrix, reshape

In R, I use cov2cor() to calculate a correlation matrix like:

  A,B,C,...
A 1,0.5,0.2,...
B 0.5,1,0.4,...
C 0.2,0.4,1,...
...

How can I reshape the matrix so that the columns are stacked in rows like:

X,Y,Correlation
A,B,0.5,
A,C,0.2,
...
B,C,0.4,
...

Note that pairs such as A,A are excluded, and A,B and B,A are treated as duplicates, so only one of them is kept.

Is there an easy way to implement this?

Kun Ren asked Dec 06 '25 17:12


1 Answer

The functions that you need are:

lower.tri / upper.tri {base}: These return a logical mask for the lower or upper triangle of a matrix, excluding the diagonal by default. Use the mask to set the diagonal and the other triangle to NA. Because cor(A,C) = cor(C,A), this keeps only one copy of each duplicated correlation.

melt {reshape2}: This takes the remaining triangle and melts it into a table with only three columns. The third column holds the correlation between the variables named in columns 1 and 2.

is.na {base}: Use this to remove rows where the third column is NA.

Update: @KunRen has suggested na.omit {stats} as a better alternative to is.na, which I agree with.

A sample solution would be like the following:

library(reshape2) # provides melt()
system.time(correlations <- cor(mydata, use = "pairwise.complete.obs")) # get the correlation matrix (system.time only reports how long it took)
upperTriangle <- upper.tri(correlations, diag = FALSE) # logical mask: TRUE strictly above the diagonal
correlations.upperTriangle <- correlations # take a copy of the original correlation matrix
correlations.upperTriangle[!upperTriangle] <- NA # set everything outside the upper triangle to NA
correlations_melted <- na.omit(melt(correlations.upperTriangle, value.name = "correlationCoef")) # melt the matrix into triplets; na.omit drops the NA rows
colnames(correlations_melted) <- c("X1", "X2", "correlation") # final column names
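
If you would rather not depend on reshape2, the same masking idea can be sketched in base R alone. This is only an illustration, not part of the original answer: the toy matrix cm reuses the A, B, C values from the question, and the names cm, idx and pairs are made up for the example.

```r
# Toy correlation matrix with the values from the question (hypothetical data)
cm <- matrix(c(1.0, 0.5, 0.2,
               0.5, 1.0, 0.4,
               0.2, 0.4, 1.0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Row/column positions of the cells strictly above the diagonal
idx <- which(upper.tri(cm), arr.ind = TRUE)

# One row per unordered pair: names come from the dimnames, values from the matrix
pairs <- data.frame(X = rownames(cm)[idx[, "row"]],
                    Y = colnames(cm)[idx[, "col"]],
                    Correlation = cm[idx],
                    stringsAsFactors = FALSE)

# which() walks the matrix column by column; reorder by row so that
# all pairs starting with A come first, then B, and so on
pairs <- pairs[order(idx[, "row"], idx[, "col"]), ]
rownames(pairs) <- NULL
print(pairs)
```

For this toy input, pairs holds the three rows A,B,0.5; A,C,0.2; B,C,0.4, which is the layout asked for in the question. With a real correlation matrix you would replace cm with the result of cor() or cov2cor().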
