I am plotting a scatterplot matrix using ggpairs. I am using the following code:
# Load required packages
library(GGally)
# Load datasets
data(state)
df <- data.frame(state.x77,
                 State = state.name,
                 Abbrev = state.abb,
                 Region = state.region,
                 Division = state.division
)
# Create scatterplot matrix
p <- ggpairs(df,
             # Columns to include in the matrix
             columns = c(3,5,6,7),
             # What to include above diagonal
             # list(continuous = "points") to mirror
             # "blank" to turn off
             upper = "blank",
             legends = TRUE,
             # What to include below diagonal
             lower = list(continuous = "points"),
             # What to include in the diagonal
             diag = list(continuous = "density"),
             # How to label inner plots
             # internal, none, show
             axisLabels = "none",
             # Other aes() parameters
             colour = "Region",
             title = "State Scatterplot Matrix"
)
# Show the plot
print(p)
and I get the following plot:
Now, one can easily see that I am getting legends for every plot in the matrix. I would like to have ONLY ONE universal legend for the whole plot. How do I do that? Any help would be much appreciated.
I am working on something similar; this is the approach I would take. Starting from the ggpairs function call above (with legends enabled), iterate over the subplots in the plot matrix, remove the legend from each of them, and retain just one, since the densities are all plotted on the same column.
colIdx <- c(3,5,6,7)
for (i in seq_along(colIdx)) {
  # Address only the diagonal elements
  # Get plot out of matrix
  inner <- getPlot(p, i, i)
  # Add any ggplot2 settings you want (blank grid here)
  inner <- inner + theme(panel.grid = element_blank()) +
    theme(axis.text.x = element_blank())
  # Put it back into the matrix
  p <- putPlot(p, inner, i, i)
  for (j in seq_along(colIdx)) {
    if (i == 1 && j == 1) {
      # Move legend right
      inner <- getPlot(p, i, j)
      inner <- inner + theme(legend.position=c(length(colIdx)-0.25,0.50))
      p <- putPlot(p, inner, i, j)
    }
    else {
      # Delete legend
      inner <- getPlot(p, i, j)
      inner <- inner + theme(legend.position="none")
      p <- putPlot(p, inner, i, j)
    }
  }
}
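For what it's worth, more recent GGally releases seem to make the loop unnecessary. As far as I know, ggpairs there takes a `mapping` argument in place of `colour = "Region"` and a `legend` argument in place of `legends`, where `legend = 1` reuses the legend of the first subplot for the whole matrix. A minimal sketch, assuming such a GGally version is installed; the bottom placement and the trimmed-down `df` are just choices made for the example:

```r
library(GGally)
library(ggplot2)

# Same state data as in the question, reduced to what the plot needs
df <- data.frame(state.x77, Region = state.region)

p <- ggpairs(
  df,
  columns = c(3, 5, 6, 7),
  # Colour by Region (assumed replacement for colour = "Region")
  mapping = aes(colour = Region),
  upper = list(continuous = "blank"),
  lower = list(continuous = "points"),
  diag = list(continuous = "densityDiag"),
  axisLabels = "none",
  title = "State Scatterplot Matrix",
  # Use the legend of the first subplot as the single shared legend
  legend = 1
) + theme(legend.position = "bottom")

print(p)
```

If the installed version still uses the older interface from the question, this call will not work as written and the loop above is the fallback.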
Hopefully, someone will show how this can be done with ggpairs(...). I'd like to see that myself. Until then, here is a solution that does not use ggpairs(...), but rather plain vanilla ggplot with facets.
library(ggplot2)
library(reshape2)   # for melt(...)
library(plyr)       # for .(...)
library(data.table)
xx <- with(df, data.table(id=1:nrow(df), group=Region, df[,c(3,5,6,7)]))
yy <- melt(xx,id=1:2, variable.name="H", value.name="xval")
setkey(yy,id,group)
ww <- yy[,list(V=H,yval=xval),key="id,group"]
zz <- yy[ww,allow.cartesian=TRUE]
setkey(zz,H,V,group)
zz <- zz[,list(id, group, xval, yval, min.x=min(xval),min.y=min(yval),
               range.x=diff(range(xval)),range.y=diff(range(yval))),by="H,V"]
d  <- zz[H==V,list(x=density(xval)$x,
                   y=min.y+range.y*density(xval)$y/max(density(xval)$y)),
         by="H,V,group"]
ggplot(zz)+
  geom_point(subset= .(xtfrm(H)<xtfrm(V)),
             aes(x=xval, y=yval, color=factor(group)),
             size=3, alpha=0.5)+
  geom_line(subset= .(H==V), data=d, aes(x=x, y=y, color=factor(group)))+
  facet_grid(V~H, scales="free")+
  scale_color_discrete(name="Region")+
  labs(x="", y="")
The basic idea is to melt(...) your df into the proper format for ggplot (xx), make two copies (yy and ww), and run a cartesian join based on id and group (here, id is just a row number and group is the Region variable) to create zz. We do need to calculate and scale the densities externally (in the data table d). In spite of all that, it still runs faster than ggpairs(...).
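To make the join and the density scaling easier to follow, here is a small base R sketch of the same two steps on made-up data. The `toy` data frame, the `scale_density` helper, and the use of `merge` instead of a data.table join are all assumptions for illustration, not part of the solution above:

```r
# Toy data (hypothetical): three rows, two numeric variables, one group
toy <- data.frame(id = 1:3, group = c("a", "a", "b"),
                  u = c(1, 2, 3), v = c(10, 20, 30))
vars <- c("u", "v")

# Long format: one row per (id, variable), similar to what melt(...) returns
long_x <- do.call(rbind, lapply(vars, function(nm) {
  data.frame(id = toy$id, group = toy$group, H = nm, xval = toy[[nm]])
}))

# Second copy with the variable and value columns renamed
long_y <- data.frame(id = long_x$id, group = long_x$group,
                     V = long_x$H, yval = long_x$xval)

# Join on id and group: within each row id, every variable is paired
# with every variable, giving one (H, V) panel entry per pair
pairs_long <- merge(long_x, long_y, by = c("id", "group"))

# Rescale a density curve so its peak sits at lo + span, which lets it
# share the y axis of a diagonal panel
scale_density <- function(x, lo, span) {
  dens <- density(x)
  data.frame(x = dens$x, y = lo + span * dens$y / max(dens$y))
}
curve_u <- scale_density(toy$u, lo = min(toy$u), span = diff(range(toy$u)))
```

With 3 ids and 2 variables, the join yields 3 x 2 x 2 rows, one per id and (H, V) combination, which is what facet_grid(V~H) then spreads across panels.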