
Display Correlation Tables as Descending List

Tags: r, correlation

When running cor() on a time series with a lot of variables, I get a table back that has a row and column for each variable, showing the correlation between them.

How can I view this table as a list sorted from most correlated to least correlated, eliminating all NA results and results that map a variable back to itself (i.e. the correlation of A to A)? I would also like to rank inverse (negative) correlations by their absolute values, but still show them as negative.

So the desired output would be something like:

A,B,0.98
A,C,0.9
C,R,-0.8
T,Z,0.5
Kyle Brandt asked Jul 21 '11 20:07





1 Answer

Here's one of many ways I can think of to do this. I used the reshape package because the melt() syntax is easy for me to remember, but the melt() step could pretty easily be done with base R commands:

library(reshape)
## set up dummy data
a <- rnorm(100)
b <- a + rnorm(100, 0, 2)
c <- a + b + rnorm(100) / 10
df <- data.frame(a, b, c)
## corMat is the correlation matrix
## (a separate name, so the column vector c is not overwritten)
corMat <- cor(df)

## keep only the lower triangle by
## filling the upper triangle and the diagonal with NA
corMat[upper.tri(corMat, diag = TRUE)] <- NA

## reshape to long format: one row per pair of variables
m <- melt(corMat)

## sort by descending absolute correlation (the sign is preserved)
m <- m[order(-abs(m$value)), ]

## omit the NA values
dfOut <- na.omit(m)

## if you really want a list and not a data.frame
listOut <- split(dfOut, seq_len(nrow(dfOut)))
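
The answer mentions that the melt() step could be done with base R. Here is a minimal sketch of what that might look like, assuming the matrix comes from cor() on a data frame so that it carries row and column names. The function name flattenCor and the column names var1/var2/value are made up for illustration and are not part of the original answer:

```r
## Hypothetical helper (not from the original answer): flatten a
## correlation matrix into a sorted data frame using base R only.
## Assumes corMat has row and column names, as cor() on a data frame gives.
flattenCor <- function(corMat) {
  ## row/column indices of the strict lower triangle; this drops the
  ## diagonal (A to A) and the mirrored duplicates (B,A vs. A,B)
  idx <- which(lower.tri(corMat), arr.ind = TRUE)
  out <- data.frame(
    var1 = rownames(corMat)[idx[, "row"]],
    var2 = colnames(corMat)[idx[, "col"]],
    value = corMat[idx],
    stringsAsFactors = FALSE
  )
  ## drop NA correlations, then sort by descending absolute value;
  ## the sign of each correlation is kept in the output
  out <- out[!is.na(out$value), ]
  out[order(-abs(out$value)), ]
}

## usage on dummy data similar to the above
set.seed(42)
a <- rnorm(100)
b <- a + rnorm(100, 0, 2)
dat <- data.frame(a = a, b = b, c = a + b + rnorm(100) / 10)
res <- flattenCor(cor(dat))

## print as comma-separated lines, like the format in the question
write.table(res, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Because only the strict lower triangle is selected, there is no need to overwrite part of the matrix with NA first; the only NA values removed are ones that cor() itself produced.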
JD Long answered Nov 06 '22 06:11
