Heatmap of regression lines

Suppose I run a Bayesian simple linear regression. I would like to visualise the results by plotting multiple regression lines drawn from the posterior distributions of a (intercept) and b (slope). How can I display them in a heatmap-like style, or alternatively use transparency to cope with the overlapping lines? Here's one simple ggplot approach.

library(ggplot2)
set.seed(123)

N = 1000
x = 1:80
a = rnorm(N,10,3)
b = rnorm(N,5,2)

y = vector("list",length=N)
for(i in 1:N) {y[[i]] = a[i]+b[i]*x}


df = data.frame(x=rep(x,N),y=unlist(y))
df$f = rep(1:N,each=80)

(plt <- ggplot(df, aes(x, y,group=f)) + 
  geom_jitter(alpha=1/30,width=5,col="blue") + theme_classic())

Are there better ways to do this? It would be nice if the colour changed with the amount of overlap (as it does in heatmaps).

beginneR asked Mar 11 '16 09:03


2 Answers

Why not do a line plot with samples from the posterior?

g = ggplot(df, aes(x, y)) + 
  geom_line(alpha=1/50,col="grey",aes(group=f)) + 
  theme_classic() 

You can then add a darker line for the posterior expectation:

# fun.y was deprecated in favour of fun in ggplot2 3.3.0; use fun.y on older versions
g + stat_summary(geom="line", fun=mean, color="black", lwd=1)
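
As a complement to the mean line, the spread of the sampled lines can also be summarised as a pointwise interval at each x. The following is a minimal sketch, not part of the original answer: it assumes the same simulated draws as in the question, and the names lines, band and p_band as well as the 95% level are choices made here for illustration.

```r
library(ggplot2)
set.seed(123)

# Simulated posterior draws, as in the question
N <- 1000
x <- 1:80
a <- rnorm(N, 10, 3)   # intercept draws
b <- rnorm(N, 5, 2)    # slope draws

# 80 x N matrix: row i holds a[j] + b[j] * x[i] for every draw j
lines <- outer(x, b) + matrix(a, nrow = length(x), ncol = N, byrow = TRUE)

# Pointwise mean and central 95% interval of the lines at each x
band <- data.frame(
  x    = x,
  mean = rowMeans(lines),
  lo   = apply(lines, 1, quantile, probs = 0.025),
  hi   = apply(lines, 1, quantile, probs = 0.975)
)

# Grey ribbon for the interval, black line for the mean
p_band <- ggplot(band, aes(x, mean)) +
  geom_ribbon(aes(ymin = lo, ymax = hi), fill = "grey80") +
  geom_line(color = "black") +
  labs(y = "y") +
  theme_classic()
p_band
```

The ribbon replaces the 1000 overlapping lines with a single shaded region, which loses the density information inside the band but avoids overplotting entirely.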

csgillespie answered Sep 28 '22 08:09


Another option is the stat_density_2d function in ggplot2, which can be used in several ways. Using your df...

As a heatmap

# after_stat() and fun need ggplot2 >= 3.3.0; older versions use ..density.. and fun.y
ggplot(df, aes(x = x, y=y))+
  stat_density_2d(aes(fill = after_stat(density)), geom = "raster", contour = FALSE)+
  scale_fill_gradient(low = "blue", high = "red")+
  stat_summary(geom="line", fun=mean, color = "white",lwd=1)+
  theme_classic()
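
If the colour should reflect how many sampled lines pass through a region, rather than a smoothed density estimate, one option is to count points per rectangular bin with geom_bin2d. This is a minimal sketch, not part of the original answer: it rebuilds df from the question's simulated draws, and the name p_bins and the bin size of 1 by 20 are arbitrary choices made here for illustration.

```r
library(ggplot2)
set.seed(123)

# Simulated posterior draws, as in the question
N <- 1000
x <- 1:80
a <- rnorm(N, 10, 3)   # intercept draws
b <- rnorm(N, 5, 2)    # slope draws

# Long format: all 80 x values for draw 1, then draw 2, and so on
df <- data.frame(
  x = rep(x, times = N),
  y = rep(a, each = length(x)) + rep(b, each = length(x)) * rep(x, times = N),
  f = rep(seq_len(N), each = length(x))
)

# Fill defaults to the number of points falling in each bin
p_bins <- ggplot(df, aes(x, y)) +
  geom_bin2d(binwidth = c(1, 20)) +
  scale_fill_gradient(low = "blue", high = "red") +
  theme_classic()
p_bins
```

Because x takes only integer values here, a bin width of 1 along x gives one column of bins per x value; the bin height along y controls how coarse the count is and would need tuning for other data.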

Alternatively, you could use points, with point size mapped to density.

# after_stat() and fun need ggplot2 >= 3.3.0; older versions use ..density.. and fun.y
ggplot(df, aes(x = x, y=y))+
  stat_density_2d(aes(size = after_stat(density)), geom = "point", contour = FALSE)+
  stat_summary(geom="line", fun=mean, color = "white",lwd=1)+
  theme_classic()

mfidino answered Sep 28 '22 07:09