
How to display coloured group correlations with scale_colour_manual in ggpairs (R)?

I'm using ggpairs on data with 3 groups. Not all variables have observations for all groups, so some panels only need to show 2 groups. Because ggpairs orders the groups alphabetically and always assigns the first colour to the first factor level that is present, the colouring is not consistent across panels. (For example: group 1 = red, group 2 = blue, group 3 = green. But for variables that only have the second and last group: group 2 = red and group 3 = blue.)

I tried to solve this problem myself by adding a scale_colour_manual in the following way:

scale_colour_manual(values = c("group1"="#F8766D", "group2"="#00BA38", "group3"="#619CFF"))

This seems to work for the density plots on the diagonal (ggally_densityDiag) and for the scatter plots in the lower part (ggally_points), but for the correlations (ggally_cor) I now only get the overall (black) correlations and none of the coloured group correlations. Before this change they were displayed, although the colours did not match the groups. Why are they not displayed anymore?
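The colour shift described above, and the effect of naming the values, can be reproduced in plain ggplot2 without ggpairs. The following is only a minimal sketch: the data frame `toy`, the palettes `pal_unnamed` and `pal_named`, and the helper `drawn_colours` are hypothetical names introduced for illustration, and the group labels are assumed to be stored as plain character strings. It does not explain the ggally_cor behaviour; it only shows how unnamed and named values are matched to groups.

```r
library(ggplot2)

# Hypothetical toy data: only two of the three groups are present.
toy <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(2, 1, 4, 3),
  grp = c("group2", "group2", "group3", "group3"),
  stringsAsFactors = FALSE
)

pal_unnamed <- c("#F8766D", "#00BA38", "#619CFF")
pal_named <- c(group1 = "#F8766D", group2 = "#00BA38", group3 = "#619CFF")

base_plot <- ggplot(toy, aes(x, y, colour = grp)) + geom_point()

# Return the colour actually drawn for each group once the scale is applied.
drawn_colours <- function(p) {
  built <- ggplot_build(p)$data[[1]]
  vapply(split(built$colour, toy$grp), unique, character(1))
}

# Unnamed values are handed out in order to the groups that are present,
# so group2 takes the first colour.
unnamed_cols <- drawn_colours(base_plot + scale_colour_manual(values = pal_unnamed))

# Named values are matched by group label, so group2 keeps its own colour.
named_cols <- drawn_colours(base_plot + scale_colour_manual(values = pal_named))

print(unnamed_cols)
print(named_cols)
```

Matching by name only works when the names in `values` are exactly the labels that occur in the colour variable, so it is worth comparing them with `unique()` of that column.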

The following code produces a plot in which the colours and groups do not match.

ggpairs(output.b[,c(13,17,18)], aes(colour = as.factor(output.b$country), alpha = 0.4),
upper = list(continuous = function(data, mapping, ...) {
  ggally_cor(data = output.b, mapping = mapping) + scale_colour_manual(values = c("#F8766D", "#00BA38", "#619CFF"))}),
lower = list(continuous = function(data, mapping, ...) {
  ggally_points(data = output.b, mapping = mapping) + scale_colour_manual(values = c("#F8766D", "#00BA38", "#619CFF"))}),
diag = list(continuous = function(data, mapping, ...) {
  ggally_densityDiag(data = output.b, mapping = mapping) + scale_fill_manual(values = c("#F8766D", "#00BA38", "#619CFF"))}))

The adapted code produces a plot in which the coloured group correlations are no longer displayed.

ggpairs(output.b[,c(13,17,18)], aes(colour = as.factor(output.b$country), alpha = 0.4),
upper = list(continuous = function(data, mapping, ...) {
  ggally_cor(data = output.b, mapping = mapping) + scale_colour_manual(values = c("group1"="#F8766D", "group2"="#00BA38", "group3"="#619CFF"))}),
lower = list(continuous = function(data, mapping, ...) {
  ggally_points(data = output.b, mapping = mapping) + scale_colour_manual(values = c("group1"="#F8766D", "group2"="#00BA38", "group3"="#619CFF"))}),
diag = list(continuous = function(data, mapping, ...) {
  ggally_densityDiag(data = output.b, mapping = mapping) + scale_fill_manual(values = c("group1"="#F8766D", "group2"="#00BA38", "group3"="#619CFF"))}))
lvdb asked Oct 21 '25 17:10


1 Answer

I had the same issue, so I rewrote the ggally_cor panel from scratch. The only extra thing you need to do is specify "Overall Corr"="black" in scale_color_manual, because the overall correlation is handled as one more group:

library(dplyr)
library(ggplot2)
library(GGally)

# pin these names to their dplyr versions so other attached packages cannot mask them
select <- dplyr::select; rename <- dplyr::rename; mutate <- dplyr::mutate
summarize <- dplyr::summarize; arrange <- dplyr::arrange; slice <- dplyr::slice
filter <- dplyr::filter; recode <- dplyr::recode

# set Sepal.Length to NA for setosa, so this group is missing for one variable
data = iris %>% mutate(Sepal.Length = ifelse(Species=="setosa",NA,Sepal.Length))

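# custom upper-panel function: one Spearman correlation per colour group plus an overall value
mycorrelations <- function(data,mapping,...){
    # pull the panel's x, y and colour variables out of the aesthetic mapping
    data2 = data
    data2$x = as.numeric(data[[as_label(mapping$x)]])
    data2$y = as.numeric(data[[as_label(mapping$y)]])
    data2$group = as.character(data[[as_label(mapping$colour)]])
    
    correlation_df = data2 %>% 
        # stack a copy of the data labelled "Overall Corr" so it is treated as one more group
        bind_rows(data2 %>% mutate(group="Overall Corr")) %>%
        group_by(group) %>% 
        # drop groups with fewer than two non-missing values in x or y
        filter(sum(!is.na(x))>1) %>%
        filter(sum(!is.na(y))>1) %>%
        summarize(estimate = round(as.numeric(cor.test(x,y,method="spearman")$estimate),2),
                  pvalue = cor.test(x,y,method="spearman")$p.value,
                  pvalue_star = as.character(symnum(pvalue, corr = FALSE, na = FALSE, 
                                                    cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1), 
                                                    symbols = c("***", "**", "*", "'", " "))))%>%
        ungroup() %>%
        # order the levels: the original groups first, "Overall Corr" last
        mutate(group = factor(group, levels=c(as.character(unique(sort(data[[as_label(mapping$colour)]]))), "Overall Corr")))
    
    # draw the labels as text, coloured by group, so the scale added to ggpairs applies
    ggplot(data=correlation_df, aes(x=1,y=group,color=group))+
        geom_text(aes(label=paste0(group,": ",estimate,pvalue_star)))
}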


ggpairs(data,columns=1:4,
        mapping = ggplot2::aes(color=Species), 
        upper = list(continuous = mycorrelations))+
    scale_color_manual(values=c("setosa"="orange","versicolor"="purple","virginica"="brown","Overall Corr"="black"))
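
To sanity-check the numbers printed in a panel, the per-group and overall Spearman correlations can also be computed in base R. This is only a rough sketch under stated assumptions: `spearman_by_group` and `dat` are hypothetical names, not part of GGally or of the function above, and the sketch keeps a group when it has at least two complete (x, y) pairs, which is slightly stricter than the two separate filters used above.

```r
# Same modification as above: Sepal.Length is set to NA for setosa.
dat <- iris
dat$Sepal.Length[dat$Species == "setosa"] <- NA

# Hypothetical helper: Spearman correlation of x and y per group, plus overall.
spearman_by_group <- function(df, x, y, group) {
  # one data frame per group, plus the full data under the label "Overall Corr"
  pieces <- c(split(df, as.character(df[[group]])), list("Overall Corr" = df))
  # keep only pieces with at least two complete (x, y) pairs
  enough <- vapply(
    pieces,
    function(d) sum(complete.cases(d[, c(x, y)])) > 1,
    logical(1)
  )
  vapply(
    pieces[enough],
    function(d) round(cor(d[[x]], d[[y]], method = "spearman", use = "complete.obs"), 2),
    numeric(1)
  )
}

rho <- spearman_by_group(dat, "Sepal.Length", "Petal.Length", "Species")
print(rho)
```

Because setosa has no Sepal.Length values left, it drops out of the result, which corresponds to the panel showing only the remaining groups and the overall value.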

Isaac Zhao answered Oct 24 '25 08:10