I have a matrix of about 1000 rows by 500 variables. Starting from the correlation matrix of these variables, I am trying to build a table that holds variable names rather than correlation values, so the outcome should look like this:
variable1 variable2 variable3 variable4 ...
mrv1 mrv2 mrv3 mrv4 ...
smrv1 smrv2 smrv3 smrv4 ...
. . . .
. . . .
. . . .
where mrv1 = the variable most related to variable1, smrv1 = the second most related variable, and so on.
I have actually built this table already, but with a for loop and a very complicated command (probably the worst command of all time, but it actually works!). I would like to do this with a proper command instead. Here's what I am using now:
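mydata <- read.csv("location", header=TRUE, sep=",")
lgn <- length(mydata)            # number of variables (columns)
crm <- cor(mydata)               # correlation matrix
# First variable: sort its correlations from highest to lowest, keep the names
k <- crm[,1]
K <- data.frame(rev(sort(k)))
A <- data.frame(rownames(K))
# Repeat for every other variable and append one column per variable
for (x in 2:lgn){
  k <- crm[,x]
  K <- data.frame(rev(sort(k)))
  B <- data.frame(rownames(K))
  A <- cbind(A,B)
}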
Any ideas for a simpler, more reliable command?
Thanks,
A correlation matrix is used to summarize data, as an input into a more advanced analysis, and as a diagnostic for advanced analyses. Key decisions to be made when creating a correlation matrix include: choice of correlation statistic, coding of the variables, treatment of missing data, and presentation.
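Two of these decisions, the correlation statistic and the treatment of missing data, map onto the `method` and `use` arguments of base R's `cor()`. The following is a minimal sketch on made-up data; the data frame `toy` and its column names are assumptions for illustration only.

```r
# Toy data with one missing value (names are invented for this example).
set.seed(1)
toy <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))
toy$b[3] <- NA

# Default settings: Pearson correlation, and a missing value in column b
# makes its correlations with the other columns NA.
cor(toy)

# Rank-based (Spearman) statistic, computed for each pair of variables
# from the observations where both values are present.
crm <- cor(toy, method = "spearman", use = "pairwise.complete.obs")
round(crm, 2)
```

Which combination is appropriate depends on the data; `use = "complete.obs"` is another option that drops every row containing a missing value before computing anything.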
Example of a Correlation Matrix
Each cell in the table shows the correlation between two specific variables. For example, a correlation of 0.82 between “hours spent studying” and “exam score” indicates that the two are strongly positively correlated.
Does this example work for what you want?
W <- rnorm( 10 )
X <- rnorm( 10 )
Y <- rnorm( 10 )
Z <- rnorm( 10 )
df <- round( cor( cbind( W , X , Y , Z ) ) , 2 )
df
# W X Y Z
# W 1.00 -0.50 -0.36 -0.27
# X -0.50 1.00 -0.42 -0.02
# Y -0.36 -0.42 1.00 0.17
# Z -0.27 -0.02 0.17 1.00
# For each column, return the variable names ordered from highest to lowest correlation
apply( df , 2 , FUN = function(x){ names(x)[ rev(order(x)) ] } )
# W X Y Z
# [1,] "W" "X" "Y" "Z"
# [2,] "Z" "Z" "Z" "Y"
# [3,] "Y" "Y" "W" "X"
# [4,] "X" "W" "X" "W"
# Use abs() if you don't care about the direction of the correlation (negative or positive), just the magnitude
apply( df , 2 , FUN = function(x){ names(x)[ rev(order( abs(x) )) ] } )
# W X Y Z
# [1,] "W" "X" "Y" "Z"
# [2,] "X" "W" "X" "W"
# [3,] "Y" "Y" "W" "Y"
# [4,] "Z" "Z" "Z" "X"
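In both results the first row is just each variable paired with itself. If that row is unwanted and the rows should be labelled by rank, the same idea can be wrapped in a small helper. This is only a sketch: `rank_related()` and its `use_abs` argument are hypothetical names, not part of base R, and the code assumes at least three variables and no missing correlations. The matrix is typed in from the output printed above so that the result is reproducible.

```r
# Correlation matrix copied from the printed example above.
vars <- c("W", "X", "Y", "Z")
crm <- matrix(c( 1.00, -0.50, -0.36, -0.27,
                -0.50,  1.00, -0.42, -0.02,
                -0.36, -0.42,  1.00,  0.17,
                -0.27, -0.02,  0.17,  1.00),
              nrow = 4, dimnames = list(vars, vars))

# Hypothetical helper: for each variable, list the other variables from
# most to least correlated. Set use_abs = TRUE to rank by magnitude only.
rank_related <- function(crm, use_abs = FALSE) {
  scores <- if (use_abs) abs(crm) else crm
  diag(scores) <- NA   # drop each variable's correlation with itself
  ranked <- apply(scores, 2, function(x) {
    # na.last = NA removes the NA entry placed on the diagonal
    names(x)[order(x, decreasing = TRUE, na.last = NA)]
  })
  rownames(ranked) <- paste0("rank", seq_len(nrow(ranked)))
  ranked
}

rank_related(crm)
#       W   X   Y   Z
# rank1 "Z" "Z" "Z" "Y"
# rank2 "Y" "Y" "W" "X"
# rank3 "X" "W" "X" "W"
```

For the data in the question, `rank_related(cor(mydata))` should give one column per variable, with `rank1` playing the role of mrv and `rank2` the role of smrv.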