Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Improve performance of ggplotly when plotting time-series heatmap

I'm building an interactive time-series heatmap in R using Plotly and Shiny. As part of this process, I'm re-coding heatmap values from continuous to ordinal format - so I have a heatmap where six colours represent specific count categories, and those categories are created from aggregated count values. However, this causes a major performance issue with the speed of the creation of heatmap using ggplotly(). I've traced it to the tooltip() function from Plotly which renders interactive boxes. Labels data from my heatmap somehow overload this function in a way that it performs very slowly, even if I just add a single label component to the tooltip(). I'm using a processed subset of COVID-19 outbreak data from Johns Hopkins CSSE repository. Here is a simplified heatmap code, which also uses The Simpsons colour theme from ggsci:

#Load packages
library(shiny)
library(plotly)
library(tidyverse)
library(RCurl)
library(ggsci)

#Read example data from Gist
confirmed <- read_csv("https://gist.githubusercontent.com/GeekOnAcid/5638e37c688c257b1c381a15e3fb531a/raw/80ba9704417c61298ca6919343505725b8b162a5/covid_selected_europe_subset.csv")

#Wrap ggplot of time-series heatmap in ggplotly, call "tooltip"  
ggplot_ts_heatmap <- confirmed %>%
  ggplot(aes(as.factor(date), reorder(`Country/Region`,`cases count`), 
             fill=cnt.cat, label = `cases count`, label2 = as.factor(date), 
             text = paste("country:", `Country/Region`))) + 
  geom_tile(col=1) +
  theme_bw(base_line_size = 0, base_rect_size = 0, base_size = 10) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),legend.title = element_blank()) +
  scale_fill_manual(labels = levels(confirmed$cnt.cat),
                    values = pal_simpsons("springfield")(7)) +
  labs(x = "", y = "")
ggplotly(ggplot_ts_heatmap, tooltip = c("text","label","label2"))

Performance improves once tooltip = c("text","label","label2") is reduced (for instance to tooltip = c("text")). Now, I know that delay is not "massive", but I'm integrating this with a Shiny app. And once it's integrated with Shiny and scaled with more data, it is really, really, really slow. I don't even show all variables in tooltip and its still slow - you can see it in the current version of the app when you click on 'confirmed' cases.

Any suggestions? I've considered alternative interactive heatmap packages like d3heatmap, heatmaply and shinyHeatmaply but all those solutions are more intended for correlation heatmaps and they lack customisation options of ggplot.

enter image description here

like image 960
Geek On Acid Avatar asked Mar 19 '20 18:03

Geek On Acid


1 Answers

If you rewrite it as "pure" plotly (without the ggplotly conversion), it will be much faster. Around 3000 times even. Here's the result of a very small benchmark:

Unit: milliseconds
 expr       min        lq       mean     median        uq       max neval
    a 9929.8299 9929.8299 9932.49130 9932.49130 9935.1527 9935.1527     2
    b    3.1396    3.1396    3.15665    3.15665    3.1737    3.1737     2

The reason why ggplotly is much slower, is that it doesnt recognize the input as a heatmap and creates a scatterplot where each rectangle is drawn separately with all the necessary attributes. You can look at the resulting JSON if you wrap the result of ggplotly or plot_ly in plotly_json().

You can also inspect the object.size of the plots, where you will see that the ggplotly object is around 4616.4 Kb and the plotly-heatmap is just 40.4 Kb big.

df_colors = data.frame(range=c(0:13), colors=c(0:13))
color_s <- setNames(data.frame(df_colors$range, df_colors$colors), NULL)
for (i in 1:14) {
  color_s[[2]][[i]] <- pal_simpsons("springfield")(13)[[(i + 1) / 2]]
  color_s[[1]][[i]] <-  i / 14 - (i %% 2) / 14
}

plot_ly(data = confirmed, text = text) %>%
  plotly::add_heatmap(x = ~as.factor(date), 
                      y = ~reorder(`Country/Region`, `cases count`),
                      z = ~as.numeric(factor(confirmed$`cnt.cat`, ordered = T, 
                                             levels = unique(confirmed$`cnt.cat`))),
                      xgap = 0.5,
                      ygap = 0.5,
                      colorscale = color_s,
                      colorbar = list(tickmode='array',
                                      title = "Cases",
                                      tickvals=c(1:7),
                                      ticktext=levels(factor(x = confirmed$`cnt.cat`,
                                                             levels = unique(confirmed$`cnt.cat`),
                                                             ordered = TRUE)), len=0.5),
                      text = ~paste0("country: ", `Country/Region`, "<br>",
                                    "Number of cases: ", `cases count`, "<br>",
                                    "Category:  ", `cnt.cat`),
                      hoverinfo ="text"
  ) %>% 
  layout(plot_bgcolor='black',
         xaxis = list(title = ""),
         yaxis = list(title = ""));
like image 69
SeGa Avatar answered Oct 18 '22 01:10

SeGa