Suppose I have a data frame consisting of 20 columns (variables), all of them numeric. I can always use the cor function in R to get the correlation coefficients in matrix form, or visualize the correlation matrix (with the coefficients labeled as well). Suppose I just want to sort the pairs of variables by their correlation coefficient. How can I do this in R?
A correlation coefficient ranges from -1 to 1. If the value is 0, there is no linear correlation between the two variables, although that alone does not make them independent. If the value is extremely close to -1 or 1, it indicates a strong linear relationship: the two variables are highly correlated, and a change in one is associated with a proportional change in the other.
Pearson's correlation coefficient is calculated as the covariance of the two variables divided by the product of their standard deviations. It is a normalization of the covariance between the two variables that gives an interpretable score.
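As a quick check of that definition, the sketch below computes the coefficient by hand and compares it with cor. The vectors x and y are made-up illustration data, not taken from the question.

```r
# Illustrative data (assumed for this example only)
set.seed(1)
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)

# Pearson's r: covariance divided by the product of the standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))

# cor() uses the Pearson method by default, so the two should agree
all.equal(r_manual, cor(x, y))
```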
As a common rule of thumb, correlation coefficients whose magnitude is between 0.9 and 1.0 indicate variables that can be considered very highly correlated, and coefficients whose magnitude is between 0.7 and 0.9 indicate variables that can be considered highly correlated.
The variables with correlation coefficient values closer to 1 show a strong positive correlation, the values closer to -1 show a strong negative correlation, and the values closer to 0 show weak or no correlation.
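One way to apply such magnitude bands in R is with cut on the absolute value. This is only a sketch: the values in r are invented, the label "below 0.7" is a placeholder, and assigning the 0.9 boundary to the "very high" band is an assumption, since both bands are described as reaching 0.9.

```r
# Illustrative coefficients (assumed for this example only)
r <- c(-0.95, -0.75, 0.10, 0.80, 0.99)

# Bands on the magnitude: [0, 0.7), [0.7, 0.9), [0.9, 1]
strength <- cut(
  abs(r),
  breaks = c(0, 0.7, 0.9, 1),
  labels = c("below 0.7", "high", "very high"),
  right = FALSE,          # intervals are closed on the left
  include.lowest = TRUE   # with right = FALSE this also includes 1
)

data.frame(r, strength)
```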
Solution using corrr:
corrr is a package for exploring correlations in R. It focuses on creating and working with data frames of correlations.
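library(corrr)
library(dplyr)  # provides arrange()
# 5 observations of 20 random variables
matrix(rnorm(100), 5) %>%
  correlate() %>%  # correlation data frame
  stretch() %>%    # long format with columns x, y, r
  arrange(r)       # sort the pairs by correlation, ascending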
Solution using reshape2 & data.table:
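You can apply reshape2::melt to the cor result and order (sort) the rows by correlation value. Some versions of data.table redirect melt to reshape2 for matrix input, but calling reshape2::melt explicitly avoids relying on that.
library(data.table)
corMatrix <- cor(matrix(rnorm(100), 5))
# melt the matrix to long format (Var1, Var2, value), then sort by value
setDT(reshape2::melt(corMatrix))[order(value)]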
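If you would rather avoid extra packages, the same idea can be sketched in base R. This version is not from the answers above: it keeps each pair once by taking the strict upper triangle, and sorts by absolute value so the strongest correlations (positive or negative) come first. The column names V1..V20 and the names var1, var2 and r are chosen here for illustration.

```r
set.seed(42)
m <- matrix(rnorm(100), nrow = 5)             # 5 rows, 20 columns
colnames(m) <- paste0("V", seq_len(ncol(m)))  # illustrative names
cm <- cor(m)

# Keep each unordered pair once: the strict upper triangle
idx <- which(upper.tri(cm), arr.ind = TRUE)
pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Sort by absolute correlation, strongest first
pairs <- pairs[order(-abs(pairs$r)), ]
head(pairs)
```

Dropping the abs() call and the minus sign gives a plain ascending sort on r, matching the output order of the two solutions above.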