
R: crosstalk::SharedData linked data with different format (wide/long), when tidyverse verbs don't work

Edit (TL;DR): Using the crosstalk package, I am looking for a way to link a plot that uses long-format data (a line plot) with an interactive table that uses wide-format data, so that each row in the table corresponds to a line in the plot.

I am trying to link a DT table with a plotly graph. The difficulty is that the graph needs data in long format, while the table should show the data in wide format. I have probably become fixated on the tidyverse way of doing things. Below is a minimal example of what I am trying to do and what I would like to obtain.

Setup:

library(tidyverse)
library(crosstalk)
library(plotly)
library(DT)

# Wide format
df_test1 <- data.frame(
  id = c("id1", "id2"),
  item1 = c(0, 4),
  item2 = c(3, 2),
  item3 = c(1, 4),
  item4 = c(3, 4),
  item5 = c(1, NA)
)

# Reshaped to long format
df_test2 <- 
  df_test1 %>%
  tidyr::pivot_longer(cols = item1:item5, names_to = "item", values_to = "value") %>%
  dplyr::mutate(item = as.factor(item)) %>%
  dplyr::mutate(value = factor(as.character(value), levels = c("0", "1", "2", "3", "4")))

What I tried:

sd1 <- SharedData$new(df_test1, key = ~id)

bscols(
  ggplotly(
    sd1$origData() %>%    # should be sd1, but that returns an error
      # reshaping
      tidyr::pivot_longer(cols = item1:item5, names_to = "item", values_to = "value") %>%
      dplyr::mutate(item = as.factor(item)) %>%
      dplyr::mutate(value = factor(as.character(value), levels = c("0", "1", "2", "3", "4"))) %>%
      # plotting
      ggplot(., aes(x = value, y = item, group = id)) + 
      geom_path() + 
      geom_point(aes(color = value), size = 3) + 
      scale_x_discrete(position = "top", limits = c("0", "1", "2", "3", "4")), 
  tooltip = c("x", "y", "group"), height = 600, width = 300),     
  datatable(sd1)
)  

Of course, this produces output only because I used sd1$origData() instead of sd1, and sd1 itself is what the crosstalk functionality needs. Using sd1 directly throws an error, because tidyverse verbs don't work on R6 crosstalk objects. So this gives the desired graph and table, but without the crosstalk link between them.
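To make the distinction above concrete, here is a minimal sketch of why sd1 and sd1$origData() behave differently. The names df_wide and sd are placeholders introduced only for this illustration, and the data is a shortened version of df_test1.

```r
library(crosstalk)

# Shortened stand-in for df_test1 (illustration only)
df_wide <- data.frame(
  id = c("id1", "id2"),
  item1 = c(0, 4),
  item2 = c(3, 2)
)

sd <- SharedData$new(df_wide, key = ~id)

# The SharedData object is an R6 wrapper, not a data frame,
# so functions that expect a data frame cannot use it directly.
is.data.frame(sd)             # FALSE
inherits(sd, "R6")            # TRUE

# origData() hands back the plain data frame, which is why the
# reshaping pipeline runs on it, but the result is no longer linked.
is.data.frame(sd$origData())  # TRUE
```

Anything derived from sd$origData() is an ordinary data frame, so a plot built from it carries no crosstalk key or group information.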

What I hope to obtain:

sd2 <- SharedData$new(df_test2, key = ~id)

bscols(
  ggplotly(
    # plotting
    ggplot(sd2, aes(x = value, y = item, group = id)) + 
    geom_path() + 
    geom_point(aes(color = value), size = 3) + 
    scale_x_discrete(position = "top", limits = c("0", "1", "2", "3", "4")),     
  tooltip = c("x", "y", "group"), height = 600, width = 300),    
  datatable(sd2)
) 

This works fine as a minimal example of the crosstalk functionality I want, but I need the data in DT::datatable to be in wide format. As in the example, points and paths (marks and traces) need to be linked to id, which should be unique for every row in the wide format. I am also hoping to find a solution in which all points and paths are invisible until the user clicks on the desired table row.
I suspect I am going about this the wrong way and need an approach I haven't thought of. I read that, as of 2021, the plotly API can work with wide-format data, but I haven't found any examples of how to do this in R.

Any help will be greatly appreciated.

Claudiu Papasteri asked Feb 04 '21 15:02


1 Answer

The trick is to create two SharedData objects, one in wide and one in long format, and connect them by giving both objects the same group name in the group argument of SharedData$new(). See below.

library(dplyr)
library(tidyr)
library(crosstalk)
library(plotly)
library(DT)

# Wide format
df_test1 <- data.frame(
  id = c("id1", "id2"),
  item1 = c(0, 4),
  item2 = c(3, 2),
  item3 = c(1, 4),
  item4 = c(3, 4),
  item5 = c(1, NA)
)

# Reshaped to long format
df_test2 <- 
  df_test1 %>%
  tidyr::pivot_longer(cols = item1:item5, names_to = "item", values_to = "value") %>%
  dplyr::mutate(item = as.factor(item)) %>%
  dplyr::mutate(value = factor(as.character(value), levels = c("0", "1", "2", "3", "4")))

# Two SharedData objects. Note the group argument:
# it can be any string, as long as it is the same for both
# datasets.
sd1 <- SharedData$new(df_test1, key = ~id, group = "groupdata")
sd2 <- SharedData$new(df_test2, key = ~id, group = "groupdata")

bscols(
  ggplotly(
      ggplot(sd2, aes(x = value, y = item, group = id)) + 
      geom_path() + 
      geom_point(aes(color = value), size = 3) + 
      scale_x_discrete(position = "top", limits = c("0", "1", "2", "3", "4")), 
    tooltip = c("x", "y", "group"), height = 600, width = 300),     
  datatable(sd1)
)  
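As a quick sanity check on the approach above, the sketch below confirms that two SharedData objects report the same group name and draw their keys from the same set of ids. The names df_wide, df_long, sd_wide and sd_long are placeholders for this illustration, and the long table is written out by hand from a shortened data set rather than reshaped with tidyr.

```r
library(crosstalk)

# Shortened wide data: one row per id (illustration only)
df_wide <- data.frame(
  id = c("id1", "id2"),
  item1 = c(0, 4),
  item2 = c(3, 2)
)

# The same values in long format: one row per id/item pair
df_long <- data.frame(
  id = rep(df_wide$id, each = 2),
  item = rep(c("item1", "item2"), times = 2),
  value = c(0, 3, 4, 2)
)

sd_wide <- SharedData$new(df_wide, key = ~id, group = "groupdata")
sd_long <- SharedData$new(df_long, key = ~id, group = "groupdata")

# Both objects belong to the same crosstalk group
identical(sd_wide$groupName(), sd_long$groupName())  # TRUE

# Keys repeat in the long data but come from the same set of ids,
# so selecting "id1" in one widget matches every "id1" row in the other.
setequal(sd_wide$key(), sd_long$key())                # TRUE
```

If either check fails (for example, a typo in the group string or ids that differ between the two tables), the widgets would render but selections in one would not carry over to the other.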
Leon Samson answered Jan 02 '23 12:01