How to include density coloring in pairwise correlation scatter plot

I have the following code:

library(GGally)
library(nycflights13)
library(tidyverse)

dat <- nycflights13::flights %>% 
       select(dep_time, sched_dep_time, dep_delay,  arr_time, sched_arr_time, arr_delay)  %>% 
       sample_frac(0.01)
dat
ggpairs(dat)

It produces this:

[image]

How can I add the density coloring so that it looks like this:

[image]

pdubois asked Jul 07 '17 02:07



1 Answer

Using ideas from "How to reproduce smoothScatter's outlier plotting in ggplot?", "R - Smoothing color and adding a legend to a scatterplot", and "How to use loess method in GGally::ggpairs using wrap function", you can define your own panel function and pass it to ggpairs.

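# Custom lower-panel function: a 2D kernel density estimate drawn as filled tiles.
my_fn <- function(data, mapping, ...) {
  ggplot(data = data, mapping = mapping) +
    # after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
    stat_density_2d(aes(fill = after_stat(density)), geom = "tile", contour = FALSE) +
    scale_fill_gradientn(colours = rainbow(100))
}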

ggpairs(dat, lower=list(continuous=my_fn))
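The tile version above shades the whole panel. If what you are after is individual points coloured by their local density, one alternative (a swapped-in technique, not what this answer originally used) is base R's densCols() from grDevices, which returns one colour per point. A minimal sketch, where my_points_fn is a hypothetical name and the point size is an arbitrary choice:

```r
library(ggplot2)
library(GGally)

# Hypothetical helper (not from the original answer):
# scatter points coloured by their local 2D density.
my_points_fn <- function(data, mapping, ...) {
  # Pull the x and y columns out of the mapping that ggpairs supplies.
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)

  # Drop incomplete pairs so every remaining point gets a colour.
  keep <- is.finite(x) & is.finite(y)
  pts <- data.frame(x = x[keep], y = y[keep])

  # densCols() maps each point's estimated density onto the colour ramp.
  pts$col <- grDevices::densCols(pts$x, pts$y,
                                 colramp = colorRampPalette(rainbow(100)))

  ggplot(pts, aes(x = x, y = y, colour = col)) +
    geom_point(size = 0.5) +
    scale_colour_identity()  # use the hex colours as-is, with no legend
}
```

It would be passed the same way as my_fn, i.e. ggpairs(dat, lower=list(continuous=my_points_fn)).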

EDIT

From comment: How do you add a histogram on the diagonal and remove "Corr:" from the correlation value?

You can set the diag and upper arguments. To add a histogram (assuming you mean geom_histogram), use diag=list(continuous=wrap("barDiag", binwidth=100)); to remove the correlation panels completely, use upper=list(continuous="blank"). If you only want to drop the text "Corr:" and keep the value, you will need to define a new function; see the function cor_fun in "Change colors in ggpairs now that params is deprecated".
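The linked answer has the original cor_fun. In case it is not to hand, here is a minimal sketch of what such a panel function might look like. The argument names (method, ndp, sz, stars) are assumptions chosen to fit a call like wrap(cor_fun, sz=10, stars=FALSE), and the linked version may differ, for example in how it sizes the label:

```r
library(ggplot2)
library(GGally)

# Sketch of an upper-panel function that prints only the correlation value,
# without the "Corr:" prefix. Argument names are assumptions, not a fixed API.
cor_fun <- function(data, mapping, method = "pearson", ndp = 2, sz = 5,
                    stars = TRUE, ...) {
  # Extract the two columns this panel is comparing.
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)

  # cor.test() drops incomplete pairs and gives both estimate and p-value.
  ct  <- cor.test(x, y, method = method)
  lbl <- formatC(unname(ct$estimate), digits = ndp, format = "f")

  if (stars) {
    # Conventional significance stars at 0.001, 0.01 and 0.05.
    idx <- findInterval(ct$p.value, c(0, 0.001, 0.01, 0.05, 1),
                        rightmost.closed = TRUE)
    lbl <- paste0(lbl, c("***", "**", "*", "")[idx])
  }

  # Place the label in the middle of the panel.
  ggplot(data = data, mapping = mapping) +
    annotate("text",
             x = mean(range(x, na.rm = TRUE)),
             y = mean(range(y, na.rm = TRUE)),
             label = lbl, size = sz) +
    theme(panel.grid = element_blank())
}
```

Here sz is used directly as the text size and stars=FALSE suppresses the significance markers.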

So your plot becomes

# cor_fun comes from the linked answer and must be defined before this call
ggpairs(dat, lower=list(continuous=my_fn),
        diag=list(continuous=wrap("barDiag", binwidth=100)),
        upper=list(continuous=wrap(cor_fun, sz=10, stars=FALSE))
        )

[image]

EDIT

From comment: How do you color the diagonal histogram like in the OP?

To colour the bars, pass the relevant arguments through wrap to barDiag, in this case fill and col. The diag argument then becomes

diag=list(continuous=wrap("barDiag", binwidth=100, fill="brown", col="black")) 

(fill sets the colour of the bars, and col sets the colour of their outline)

user20650 answered Sep 24 '22 01:09