
correlation matrix with names

Tags:

r

I have a matrix of about 1000 rows by 500 variables. I am trying to build a correlation matrix for these variables that shows names rather than numbers, so the result should look like this:

variable1    variable2    variable3    variable4 ...
  mrv1         mrv2         mrv3          mrv4   ...
 smrv1        smrv2        smrv3          smrv4   ...
   .             .           .             .
   .             .           .             .
   .             .           .             .

where mrv1 = the variable most related to variable1, smrv1 = the second most related variable, and so on.

I have actually built the correlation matrix, but with a for loop and a very complicated command (probably the worst command of all time, but it actually works!). I would like to do this with a proper command. Here is what I am using now:

mydata <- read.csv("location", header=TRUE, sep=",")
lgn <- length(mydata)
crm <- cor(mydata)

k <- crm[,1]
K <- data.frame(rev(sort(k)))
A <- data.frame(rownames(K))

for (x in 2:lgn) {
  k <- crm[, x]
  K <- data.frame(rev(sort(k)))
  B <- data.frame(rownames(K))
  A <- cbind(A, B)
}

Any ideas for a simpler, more reliable command?

Thanks,

Error404 asked Mar 14 '13 15:03



1 Answer

Does this example work for what you want?

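# Four random variables; no seed is set, so your values will differ
W <- rnorm( 10 )
X <- rnorm( 10 )
Y <- rnorm( 10 )
Z <- rnorm( 10 )

# cor() returns a matrix, so despite its name df is a matrix, not a data frame
df <- round( cor( cbind( W , X , Y , Z ) ) , 2 )
df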
#         W     X     Y     Z
#   W  1.00 -0.50 -0.36 -0.27
#   X -0.50  1.00 -0.42 -0.02
#   Y -0.36 -0.42  1.00  0.17
#   Z -0.27 -0.02  0.17  1.00


apply( df , 2 , FUN = function(x){ j <- rev(order(x)); y <- names(x)[j]  } )
#        W   X   Y   Z  
#   [1,] "W" "X" "Y" "Z"
#   [2,] "Z" "Z" "Z" "Y"
#   [3,] "Y" "Y" "W" "X"
#   [4,] "X" "W" "X" "W"


# Use abs() if you don't care about the direction of the correlation (negative or positive), just the magnitude
apply( df , 2 , FUN = function(x){ j <- rev(order(   abs(x)   )); y <- names(x)[j]  } )
#        W   X   Y   Z  
#   [1,] "W" "X" "Y" "Z"
#   [2,] "X" "W" "X" "W"
#   [3,] "Y" "Y" "W" "Y"
#   [4,] "Z" "Z" "Z" "X"
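
In both results the first row is always the variable itself, because its correlation with itself is 1, while the layout in the question starts with the most related other variable. A minimal sketch of one way to handle that, assuming you want to drop the self-match and optionally keep only the top n names per column; `rank_correlates` and its arguments `n` and `use_abs` are hypothetical names introduced here, not base R functions:

```r
# Hypothetical helper: for each column of a correlation matrix, return the
# names of the OTHER variables ordered from most to least correlated.
rank_correlates <- function(crm, n = ncol(crm) - 1, use_abs = FALSE) {
  vars <- colnames(crm)
  ranked <- lapply(vars, function(v) {
    x <- crm[, v]                  # correlations of every variable with v
    x <- x[names(x) != v]          # drop the self-correlation
    if (use_abs) x <- abs(x)       # rank by magnitude only
    # sort() drops NAs; indexing with seq_len(n) pads short results with NA
    names(sort(x, decreasing = TRUE))[seq_len(n)]
  })
  matrix(unlist(ranked), nrow = n, dimnames = list(NULL, vars))
}

# The rounded correlation matrix printed above, rebuilt by hand
m <- matrix(c( 1.00, -0.50, -0.36, -0.27,
              -0.50,  1.00, -0.42, -0.02,
              -0.36, -0.42,  1.00,  0.17,
              -0.27, -0.02,  0.17,  1.00),
            nrow = 4,
            dimnames = list(c("W", "X", "Y", "Z"), c("W", "X", "Y", "Z")))

res <- rank_correlates(m)
res[, "W"]                                  # "Z" "Y" "X"
rank_correlates(m, use_abs = TRUE)[, "Z"]   # "W" "Y" "X"
```

With the data from the question, the call would be `rank_correlates(cor(mydata))`, or `rank_correlates(cor(mydata), n = 10)` to keep only the ten most related variables for each column.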

Simon O'Hanlon answered Sep 29 '22 12:09