
How to scale (normalise) values of ggplot2 stat_bin2d within each column (by X axis)

I have a ggplot stat_bin2d "heatmap".

library(ggplot2)
value<-rep(1:5, 1000)
df<-as.data.frame(value)    
df$group<-rep(1:7, len=5000)
df<-df[sample(nrow(df), 3000), ]
ggplot(df, aes(factor(group), factor(value))) +stat_bin2d()

I have tried to add fill to aes:

aes(factor(group), factor(value),fill = (..count..)/mean(..count..))

as a way to mimic ..density.., which does not seem to be accepted. This is not what I want, though: it seems to normalise using the counts of the whole df. I want the count of values in each group (each column along the x axis) normalised by the mean (or sum, or another statistic) within that group. Unfortunately, sum(..count..) seems to give the sum over the whole df, not only over the column.
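For concreteness, the per-column quantity described above can be computed outside ggplot2 with base R. This is only a sketch of the target numbers, not a plotting solution; the names `tab`, `col_share` and `col_rel_mean` and the `set.seed(1)` call are introduced here for illustration and are not part of the original example.

```r
# Rebuild the example data (seed added here only for reproducibility)
set.seed(1)
df <- data.frame(value = rep(1:5, 1000))
df$group <- rep(1:7, length.out = 5000)
df <- df[sample(nrow(df), 3000), ]

# Counts per (value, group) cell: rows are values, columns are groups
tab <- table(value = df$value, group = df$group)

# Share of each cell within its column (every column sums to 1)
col_share <- prop.table(tab, margin = 2)

# Count of each cell relative to the mean count of its column
col_rel_mean <- sweep(tab, 2, colMeans(tab), "/")

print(round(col_share, 3))
```

Either table has one column per group, so each column is scaled independently of the others, which is the behaviour the plot should reproduce.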

MartinT asked Nov 22 '22 14:11


1 Answer

I know this post is ancient, but I came across it when trying to do the same thing and didn't want to use geom_tile. I was able to implement it with after_stat and a normalization function:

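library(dplyr)  # provides %>%, group_by, mutate, n, ungroup, pull

# Rescale v within each x slice so that it integrates to ~1 along y
norm_across_y <- function(v, x, y){
    data.frame(v=v, x=x, y=y) %>%
        group_by(x) %>%
        # (max(y)-min(y))/n() approximates the grid spacing along y
        mutate(v=v/((max(y)-min(y))/n()*sum(v))) %>%
        ungroup() %>%
        pull(v)
}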

library(ggplot2)

# `data`, `xvar` and `yvar` stand for your own data frame and columns
ggplot(data, aes(x=xvar, y=yvar)) +
    stat_density_2d_filled(aes(fill=after_stat(norm_across_y(density, x, y))), geom="raster", contour=FALSE, n=500) +
    geom_point(color="red", shape="x") +
    scale_x_continuous(expand=c(0,0)) +
    scale_y_continuous(expand=c(0,0)) +
    scale_fill_viridis_c(limits=c(0,NA))

This normalizes each slice along the x axis so that its integral along the y axis is 1, which was my use case.
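The same after_stat idea can probably be carried over to the stat_bin2d plot from the question. The sketch below is an untested adaptation rather than part of the approach above: `norm_within_x` is a hypothetical helper name, and it assumes that the computed `count` and `x` columns are visible to after_stat for the whole layer (which matches the observation in the question that sum(..count..) covers the whole df). With facets, the same x value would repeat across panels, so the grouping would presumably need the panel as well.

```r
# Divide each bin's count by a summary (sum by default) of all counts
# that share the same x position, i.e. the same column of the heatmap.
norm_within_x <- function(count, x, fun = sum) {
  count / ave(count, x, FUN = fun)
}

# Hand-made check: two columns (x = 1 and x = 2) with two bins each
norm_within_x(c(1, 3, 2, 2), c(1, 1, 2, 2))
# 0.25 0.75 0.50 0.50

# Plot only if ggplot2 is installed; data rebuilt from the question
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  df <- data.frame(value = rep(1:5, 1000))
  df$group <- rep(1:7, length.out = 5000)
  df <- df[sample(nrow(df), 3000), ]

  # Assumption: after_stat() exposes `count` and `x` from stat_bin2d
  p <- ggplot(df, aes(factor(group), factor(value))) +
    stat_bin2d(aes(fill = after_stat(norm_within_x(count, x))))
  # print(p) draws the heatmap
}
```

Passing `fun = mean` instead of the default gives counts relative to the column mean, which is the variant the question asks about first.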

Matt answered Nov 24 '22 05:11