
Modify color scale legend guide to match line size in ggplot2

How do you override the aes size value for a ggplot2 legend guide based on a column in the data set?

Refer to this example (Edit 2: added Trial C, and changed the line size to use a log scale):

library(data.table)
library(ggplot2)
set.seed(26798)

dt<-rbind(data.table(Trial="A",Value=rweibull(1000,1.0,0.5)),
      data.table(Trial="B",Value=rweibull(100,1.2,0.75)),
      data.table(Trial="C",Value=rweibull(10,1.3,0.8)))

# Add a count and something like a cumulative distribution:
dt2<-dt[order(Trial,Value),list(Value,N=.N),by=Trial][,list(Value,N,y=1-cumsum(N)/sum(N)),by=Trial]
dt2
##      Trial        Value    N     y
##   1:     A 0.0003628745 1000 0.999
##   2:     A 0.0013002615 1000 0.998
##   3:     A 0.0017002173 1000 0.997
##   4:     A 0.0022597343 1000 0.996
##   5:     A 0.0026608082 1000 0.995
##  ---                              
##1096:     B 1.6821827814  100 0.040
##1097:     B 2.2431595707  100 0.030
##1098:     B 2.5122479833  100 0.020
##1099:     B 2.5519954416  100 0.010
##1100:     B 2.6848412995  100 0.000

ggplot(dt2) +
  geom_line(aes(x=Value,y=y,group=Trial,color=Trial,size=N)) +
  scale_size(range=c(0.1, 2), trans="log") +
  guides(size=FALSE, color=guide_legend(override.aes=list(size=2)))

[Image: Plot of three trials]

I would like the line thickness for each value of Trial in the guide legend to match the line in the plot (i.e. "A" should be thick and "B" should be thin). Edit 1: @Arun and @ChelseaE gave good suggestions for adjusting each thickness manually, but my actual dataset has many factor levels and is constantly changing, so I need it to be "dynamic".

The answer from @DidzisElferts to a similar question (Control ggplot2 legend look without affecting the plot) shows how to set the size to a static value. The size=2 part in the last line of the example above lets me change the line size of the legend, but I would like it to match the size of the line in the plot. Using size=N instead seems logical, but it gives the error "object 'N' not found". What is the correct syntax?

Desired output:

[Image: Plot of three trials with desired legend]

asked Mar 23 '23 16:03 by dnlbrky



1 Answer

You should set a size for each of A and B; you've set just one. Try this:

p <- ggplot(dt2) +
  geom_line(aes(x=Value,y=y,group=Trial,color=Trial,size=N)) +
  scale_size(range=c(0.1, 2)) +
  guides(size=FALSE, color=guide_legend(override.aes=list(size=c(2, .1))))

Following OP's comment:

Okay, in that case, you'll have to do a bit more work (there may be easier ways; I can't think of any at the moment).

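scales <- c(0.1, 2) # the range you want: min, max
# Fit a straight line through (min N, min size) and (max N, max size);
# vals[1] is the intercept and vals[2] is the slope.
vals <- summary(lm(scales ~ c(min(dt2$N), max(dt2$N))))$coefficients[,1]
# One legend size per distinct N, in the order the trials appear in dt2
sizes <- vals[2] * unique(dt2$N) + vals[1]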

ggplot(dt2) +
  geom_line(aes(x=Value,y=y,group=Trial,color=Trial,size=N)) +
  scale_size(range=scales) +
  guides(size=FALSE, color=guide_legend(override.aes=list(size=sizes)))

This should work. Try it and let me know if you have any issues.
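One caveat on the linear fit: it only reproduces the plotted sizes at the two endpoints of the range. As far as I know, scale_size() rescales the (transformed) values to [0, 1] and then takes a square root before stretching them onto the range, so with three or more levels, or with trans="log" as in the question, the legend keys can drift away from the plotted lines. Here is a minimal base R sketch that mimics that mapping. Note that legend_sizes is a hypothetical helper, not part of ggplot2, and the square-root step is an assumption about scale_size() that is worth checking against your installed version:

```r
# Hypothetical helper (not a ggplot2 function): approximate the size that
# scale_size(range = range, trans = "log") would assign to each value of n.
# Assumption: the scale rescales the transformed values to [0, 1], takes the
# square root, and then stretches the result onto `range`.
legend_sizes <- function(n, range = c(0.1, 2), trans = log) {
  t <- trans(n)
  if (max(t) == min(t)) {
    # Only one distinct value: nothing to rescale. Falling back to the upper
    # end of the range is an arbitrary choice made for this sketch.
    return(rep(range[2], length(n)))
  }
  unit <- (t - min(t)) / (max(t) - min(t))  # rescale to [0, 1]
  unname(range[1] + sqrt(unit) * diff(range))
}

# One N per Trial, listed in legend order (A, B, C) for the question's data
n_per_trial <- c(A = 1000, B = 100, C = 10)
sizes <- legend_sizes(n_per_trial)
print(round(sizes, 3))
```

With the question's data you would then use scale_size(range=c(0.1, 2), trans="log") and pass override.aes=list(size=legend_sizes(dt2[, unique(N)])). This relies on unique(N) coming out in the same order as the Trial levels in the legend, which holds here because dt2 is sorted by Trial.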

answered Apr 05 '23 21:04 by Arun
