dplyr + magrittr + qplot = no plot?

I want to use qplot (ggplot2) and then forward the data with magrittr:

This works:

mtcars %>% qplot(mpg, cyl, data=.)

This produces an error:

mtcars %>% qplot(mpg, cyl, data=.) %>% summarise(mean(mpg))

And these produce only the summary statistics, with no plot:

mtcars %T>% qplot(mpg, cyl, data=.) %>% summarise(mean(mpg))
mtcars %>% {qplot(mpg, cyl, data=.); .} %>% summarise(mean(mpg))
mtcars %T>% {qplot(mpg, cyl, data=.)} %>% summarise(mean(mpg))

What is the problem? I already found this solution, but it does not help, as you can see from the code above.

Tim asked Feb 12 '23 10:02


2 Answers

All ggplot2 functions return an object that represents a plot; to see it, you need to print it. That normally happens automatically when you're working in the console, but it needs to be explicit inside a function or a chain.
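As a minimal sketch of what explicit printing looks like in a chain (assuming ggplot2, magrittr and dplyr are installed; the column name mean_mpg is only an illustrative choice):

```r
library("ggplot2")
library("magrittr")
library("dplyr")

# %T>% runs the right-hand side for its side effect and forwards mtcars unchanged.
# print() draws the plot, which the chain would otherwise build and silently drop.
result <- mtcars %T>%
  {print(qplot(mpg, cyl, data = .))} %>%
  summarise(mean_mpg = mean(mpg))
```

Here the plot is drawn and result still holds the one-row summary of mtcars.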

The most elegant solution I could come up with is this:

library("ggplot2")
library("magrittr")
library("dplyr")

echo <- function(x) {
  print(x)
  x
}
# %T>% discards the value returned by echo() and forwards mtcars itself
mtcars %T>% 
  {echo(qplot(mpg, cyl, data = .))} %>% 
  summarise(mean(mpg))

It seems like there should be a better way.

hadley answered Feb 19 '23 18:02


This seems cleaner to me, because it does not require %T>% (which IMHO makes a pipe harder to rearrange and read) and needs no {} around the expression to avoid passing the object there. I'm not sure how much harm there is in passing the object and ignoring it.

I've never had a use for the %T>% tee where I didn't also want to print or plot. And I never wanted to print/plot the object being piped itself (usually a big dataset). So I never use %T>%.

library("ggplot2")
library("dplyr")


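# Pass `pass` through unchanged, optionally printing something along the way.
pap <- function(pass, to_print = NULL, side_effect = NULL) {
  if (!is.null(to_print)) {
    if (is.function(to_print)) {
      # A function is applied to the piped object and its result is printed
      print(to_print(pass))
    } else {
      # Anything else (e.g. a ggplot object) is printed as-is
      print(to_print)
    }
  }
  side_effect      # force the lazily evaluated argument so its side effect runs
  invisible(pass)
}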

mtcars  %>% 
   pap(summary) %>% 
   pap(side_effect = plot(.)) %>% 
   pap(qplot(mpg, cyl, data = .)) %>% 
   summarise(mean(mpg))

I usually don't use plotting as a side effect in my pipes, so the solution above works best for me (it requires some extra typing, side_effect =, for a side-effect plot). I'd like to be able to disambiguate between these intended scenarios (e.g. plot vs. qplot) automatically, but haven't found a reliable way.
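The bare side_effect line in pap works because R evaluates function arguments lazily: the expression passed in only runs when the argument is first used. A small base-R sketch of that mechanism (with_side_effect, bump and calls are hypothetical names for illustration, not part of the answer's code):

```r
# Counter that records how many times the side effect actually ran.
calls <- 0
bump <- function() {
  calls <<- calls + 1
  invisible(NULL)
}

# Stripped-down version of the pass-through idea used in pap().
with_side_effect <- function(pass, side_effect = NULL) {
  side_effect      # forces the argument, so bump() runs here, exactly once
  invisible(pass)  # hand the original object on without printing it
}

out <- with_side_effect(1:3, side_effect = bump())
```

After the call, out is the unchanged input and calls has been incremented once. Without the bare side_effect line the argument would never be evaluated and bump() would not run at all.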

Ruben answered Feb 19 '23 19:02