
Generate graphs in R for certain correlations in a matrix

I want to generate graphs for pairs of variables (columns) whose correlation is above or below a certain threshold and whose p-value is < 0.01. The graphs would be ggplot2 (line or bar) graphs plotting the two correlated columns (variables).

Here is the gist of my approach so far, with some dummy data. I would love a pointer on where to go next.

# Create some dummy data
df <- data.frame(sample(1:50), sample(1:50), sample(1:50), sample(1:50))
colnames(df) <- c("var1", "var2", "var3", "var4")

# Find correlations in the dummy data
df.cor <- cor(df)

# Make up some random pvalues for this example
x <- 0:1000
df.cor.pvals <- data.frame(sample(x/1000, 4), sample(x/1000, 4), sample(x/1000, 4), sample(x/1000,4))
colnames(df.cor.pvals) <- c("var1", "var2", "var3", "var4")

# Find the significant correlations
df.cor.extreme <- ((df.cor < -0.01 | df.cor > 0.01) & df.cor.pvals < 0.5)

# Prepare the data for plotting
library(reshape2)  # melt
library(ggplot2)
df$rownames <- rownames(df)
df.melt <- melt(df, id="rownames")

# I want to plot the combinations of variables that have a TRUE value
# in the df.cor.extreme matrix 

Below is a hardcoded example for the case where var1 and var2 have a value of TRUE. I assume this is where I need some sort of loop to generate one plot for each pair varA, varB that is correlated.

ggplot(df.melt[(df.melt$variable=="var1" | df.melt$variable=="var2"),], aes(x=rownames, y=value, group=variable, colour=variable)) +
  geom_line()

Example plot

asked Dec 28 '12 at 05:12 by themartinmcfly


1 Answer

As @DrewSteen said in a comment, the p-value matrix must have the same shape as the correlation matrix.

Here is a function that computes the p-value matrix (a built-in function for this probably exists, perhaps in the stats package):

pvalue.matrix <- function(x,...){
  ncx <- ncol(x)
  r <- matrix(0, nrow = ncx, ncol = ncx)
  for (i in seq_len(ncx)) {
    for (j in seq_len(i)) {
      x2 <- x[, i]
      y2 <- x[, j]
      r[i, j] <-  cor.test(x2,y2,...)$p.value
    }
  }
  r <- r + t(r) - diag(diag(r))
  rownames(r) <- colnames(x)
  colnames(r) <- colnames(x)
  r
}

Then use the vectorized operators | and & like this:

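# Restrict df to the numeric columns in case the rownames column was already added
df.cor.sig <- (df.cor > 0.01 | df.cor < -0.01) & pvalue.matrix(df[, colnames(df.cor)]) < 0.5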
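
The logical matrix is symmetric and its diagonal is always TRUE (each variable correlates perfectly with itself), so every pair shows up twice alongside four self-pairs. A minimal, self-contained sketch of listing each significant pair only once is given below. The seed, the names p.mat, sig and sig.pairs, and the thresholds are assumptions for illustration, not part of the original code.

```r
# Self-contained sketch: list each significant pair exactly once.
set.seed(1)  # assumption: fixed seed so the dummy data is reproducible
df <- data.frame(var1 = sample(1:50), var2 = sample(1:50),
                 var3 = sample(1:50), var4 = sample(1:50))

df.cor <- cor(df)

# p-value for every pair of columns (same idea as pvalue.matrix above)
p.mat <- sapply(df, function(a) sapply(df, function(b) cor.test(a, b)$p.value))

# Illustrative thresholds, matching the dummy example
sig <- abs(df.cor) > 0.01 & p.mat < 0.5

# Keep only the upper triangle: drops the diagonal and the mirrored duplicates
idx <- which(sig & upper.tri(sig), arr.ind = TRUE)
sig.pairs <- data.frame(a = rownames(sig)[idx[, "row"]],
                        b = colnames(sig)[idx[, "col"]],
                        stringsAsFactors = FALSE)
sig.pairs
```

Each row of sig.pairs names two columns that passed both tests, which is a convenient input for a plotting loop.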

The plot is a classic geom_tile heatmap:

library(reshape2) ## melt
library(plyr)     ## round_any
library(ggplot2)
## Use only the numeric columns, in case df already has the rownames column
pvals <- pvalue.matrix(df[, colnames(df.cor)])
dat <- expand.grid(var1=1:4, var2=1:4)
dat$value <- melt(df.cor.sig)$value
dat$labels <- paste(round_any(df.cor,0.01), '(', round_any(pvals,0.01), ')', sep='')
ggplot(dat, aes(x=var1, y=var2, label=labels)) +
  geom_tile(aes(fill = value), colour='white') +
  geom_text()


Edit after OP clarification

library(grid)       ## nullGrob
library(gridExtra)  ## grid.arrange

## One grob per row of dat: a line plot for TRUE pairs, an empty grob otherwise
plots <- apply(dat, 1, function(x){
    plot.grob <- nullGrob()
    if (grepl('TRUE', x[3])) {
      gg <- paste('var', c(x[1], x[2]), sep='')
      p <- ggplot(subset(df.melt, variable %in% gg),
                  aes(x=rownames, y=value, group=variable, colour=variable)) +
           geom_line()
      plot.grob <- ggplotGrob(p)
    }
    plot.grob
})

do.call(grid.arrange, plots)
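
The apply() call above walks the full 4 x 4 grid, so it also draws the diagonal and both orderings of each pair. As a rough alternative sketch, the loop can run over unordered column pairs only. The helpers is_significant and plot_pair, the seed, and the thresholds are hypothetical names and values introduced here for illustration. The sketch assumes a gridExtra version whose grid.arrange() accepts a grobs list of ggplot objects.

```r
library(ggplot2)
library(gridExtra)

set.seed(1)  # assumption: fixed seed so the dummy data is reproducible
df <- data.frame(var1 = sample(1:50), var2 = sample(1:50),
                 var3 = sample(1:50), var4 = sample(1:50))

# Hypothetical helper: TRUE if columns a and b pass both (illustrative) thresholds
is_significant <- function(data, a, b, r.min = 0.01, p.max = 0.5) {
  ct <- cor.test(data[[a]], data[[b]])
  abs(unname(ct$estimate)) > r.min && ct$p.value < p.max
}

# Hypothetical helper: line plot of two columns against the row index
plot_pair <- function(data, a, b) {
  long <- data.frame(index    = rep(seq_len(nrow(data)), times = 2),
                     variable = rep(c(a, b), each = nrow(data)),
                     value    = c(data[[a]], data[[b]]))
  ggplot(long, aes(x = index, y = value, colour = variable)) +
    geom_line() +
    ggtitle(paste(a, "vs", b))
}

# Every unordered pair of columns, tested once
all.pairs <- combn(names(df), 2, simplify = FALSE)
sig.pairs <- Filter(function(p) is_significant(df, p[1], p[2]), all.pairs)

# One ggplot object per significant pair
plots <- lapply(sig.pairs, function(p) plot_pair(df, p[1], p[2]))

# Draw only if at least one pair passed
if (length(plots) > 0) grid.arrange(grobs = plots)
```

Using a numeric row index on the x axis also keeps the points in row order, whereas character row names would be sorted alphabetically ("1", "10", "11", ...).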


answered Sep 20 '22 at 00:09 by agstudy