Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Correlation Corrplot Configuration

I am newbie in R scripts :-)

I need build a correlation matrix and I´am trying to configurate some parameters to adapt the graph. I am using the corrplot package.

I Built a corrplot matrix this way:

corrplot(cor(d1[,2:14], d1[,2:14]), method=c("color"),
         bg = "white", addgrid.col = "gray50", 
         tl.cex=1, type="lower", tl.col = "black", 
         col = colorRampPalette(c("red","white","blue"))(100))

I need show the values of correlation in the lower matrix inside the color matrix that I built. How i can do that?

Is it possible exclude the main diagonal from the lower matrix? In this diagonl always we have the perfect correlation.

The other doubt - I want to show the significant values for the correlation using stars instead of squares. like (*, , *). Is it possible?

Can you help me guys?

like image 257
Corintho Avatar asked Sep 25 '13 18:09

Corintho


People also ask

What correlation does Corrplot use?

[ R , PValue ] = corrplot( Tbl ) plots the Pearson's correlation coefficients between all pairs of variables in the table or timetable Tbl , and also returns tables for the correlation matrix R and matrix of p-values PValue .

What does Corrplot do in R?

R package corrplot provides a visual exploratory tool on correlation matrix that supports automatic variable reordering to help detect hidden patterns among variables. corrplot is very easy to use and provides a rich array of plotting options in visualization method, graphic layout, color, legend, text labels, etc.

How do you make a Corrplot bigger in R?

The correlation coefficient value size in correlation matrix plot created by using corrplot function ranges from 0 to 1, 0 referring to the smallest and 1 referring to the largest, by default it is 1. To change this size, we need to use number. cex argument.


1 Answers

With a bit of hackery you can do this in a very similar R package, corrgram. This one allows you to easily define your own panel functions, and helpfully makes theirs easy to view as templates. Here's the some code and figure produced:

set.seed(42)
library(corrgram)

# This panel adds significance starts, or NS for not significant
panel.signif <-  function (x, y, corr = NULL, col.regions, digits = 2, cex.cor, 
                           ...) {
  usr <- par("usr")
  on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  results <- cor.test(x, y, alternative = "two.sided")
  est <- results$p.value
  stars <- ifelse(est < 5e-4, "***", 
                  ifelse(est < 5e-3, "**", 
                         ifelse(est < 5e-2, "*", "NS")))
  cex.cor <- 0.4/strwidth(stars)
  text(0.5, 0.5, stars, cex = cex.cor)
}

# This panel combines edits the "shade" panel from the package
# to overlay the correlation value as requested
panel.shadeNtext <- function (x, y, corr = NULL, col.regions, ...) 
{
  if (is.null(corr)) 
    corr <- cor(x, y, use = "pair")
  ncol <- 14
  pal <- col.regions(ncol)
  col.ind <- as.numeric(cut(corr, breaks = seq(from = -1, to = 1, 
                                               length = ncol + 1), include.lowest = TRUE))
  usr <- par("usr")
  rect(usr[1], usr[3], usr[2], usr[4], col = pal[col.ind], 
       border = NA)
  box(col = "lightgray")
  on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- formatC(corr, digits = 2, format = "f")
  cex.cor <- .8/strwidth("-X.xx")
  text(0.5, 0.5, r, cex = cex.cor)
}

# Generate some sample data
sample.data <- matrix(rnorm(100), ncol=10)

# Call the corrgram function with the new panel functions
# NB: call on the data, not the correlation matrix
corrgram(sample.data, type="data", lower.panel=panel.shadeNtext, 
         upper.panel=panel.signif)

enter image description here

The code isn't very clean, as it's mostly patched together functions from the package, but it should give you a good start to get the plot you want. Possibly you can take a similar approach with the corrplot package too.

update: Here's a version with stars and cor on the same triangle:

panel.shadeNtext <- function (x, y, corr = NULL, col.regions, ...) 
{
  corr <- cor(x, y, use = "pair")
  results <- cor.test(x, y, alternative = "two.sided")
  est <- results$p.value
  stars <- ifelse(est < 5e-4, "***", 
                  ifelse(est < 5e-3, "**", 
                         ifelse(est < 5e-2, "*", "")))
  ncol <- 14
  pal <- col.regions(ncol)
  col.ind <- as.numeric(cut(corr, breaks = seq(from = -1, to = 1, 
                                               length = ncol + 1), include.lowest = TRUE))
  usr <- par("usr")
  rect(usr[1], usr[3], usr[2], usr[4], col = pal[col.ind], 
       border = NA)
  box(col = "lightgray")
  on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- formatC(corr, digits = 2, format = "f")
  cex.cor <- .8/strwidth("-X.xx")
  fonts <- ifelse(stars != "", 2,1)
  # option 1: stars:
  text(0.5, 0.4, paste0(r,"\n", stars), cex = cex.cor)
  # option 2: bolding:
  #text(0.5, 0.5, r, cex = cex.cor, font=fonts)
}

# Generate some sample data
sample.data <- matrix(rnorm(100), ncol=10)

# Call the corrgram function with the new panel functions
# NB: call on the data, not the correlation matrix
corrgram(sample.data, type="data", lower.panel=panel.shadeNtext, 
         upper.panel=NULL)

enter image description here

Also commented out is another way of showing significance, it'll bold those below a threshold rather than using stars. Might be clearer that way, depending on what you want to show.

like image 68
blmoore Avatar answered Sep 27 '22 23:09

blmoore