
Using dplyr for exploratory plots

Tags: r, dplyr, plyr

I regularly use d_ply from plyr to produce exploratory plots.

A trivial example:

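library(plyr)
library(ggplot2)  # provides qplot()

# Draw a sepal scatter plot for the rows of one species
plot_species <- function(species_data){
  p <- qplot(data=species_data,
             x=Sepal.Length,
             y=Sepal.Width)
  print(p)  # print explicitly; plots are not auto-printed inside a function
}

# Call plot_species once per species and discard the return values
d_ply(.data=iris,
      .variables="Species",
      .fun=plot_species)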

This produces three separate plots, one for each species.

I would like to reproduce this behaviour using functions in dplyr.

This seems to require the reassembly of the data.frame within the function called by summarise, which is often impractical.

require(dplyr)

iris_by_species <- group_by(iris,Species)

plot_species <- function(Sepal.Length,Sepal.Width){

  species_data <- data.frame(Sepal.Length,Sepal.Width)

  p <- qplot(data=species_data,
             x=Sepal.Length,
             y=Sepal.Width)
  print(p)

}


summarise(iris_by_species, plot_species(Sepal.Length,Sepal.Width))

Can each group's subset of the data.frame be passed directly to the function called by summarise, rather than passing individual columns?

Etienne Low-Décarie asked Mar 19 '23 09:03


1 Answer

I believe do() works for this, using the same function you passed to d_ply. The plots print directly to the plotting window, and if you use a named argument they are also saved as a list-column in the resulting data.frame (see the help page; this is essentially like using dlply). I don't fully grasp everything do() can do, but without a named argument I get an error, presumably because an unnamed do() expects each result to be a data frame and plot_species returns a plot object. The plots still print to the plotting window (in RStudio), since print(p) runs before the error is raised.

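library(dplyr)
library(ggplot2)  # provides qplot()

plot_species <- function(species_data){
  p <- qplot(data=species_data,
             x=Sepal.Length,
             y=Sepal.Width)
  print(p)  # draws the plot; the plot object is also returned invisibly
}

# The named argument stores each group's result in a list-column called plot
group_by(iris, Species) %>%
    do(plot = plot_species(.))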
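To make the list-column behaviour concrete, here is a minimal sketch of storing the plots first and pulling one out afterwards. The helper make_species_plot and the variable plots are hypothetical names chosen for illustration, and the helper builds the plot with ggplot() rather than qplot(). The last lines show group_walk(), which I understand to be available in dplyr 0.8.0 and later, as one possible alternative when only the side effect of printing is wanted.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: builds the plot for one group but does not print it
make_species_plot <- function(species_data) {
  ggplot(species_data, aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point()
}

# One row per species; the named argument becomes a list-column of plots
plots <- iris %>%
  group_by(Species) %>%
  do(plot = make_species_plot(.))

# Retrieve and draw the plot for the first species
print(plots$plot[[1]])

# Alternative (assumes dplyr >= 0.8.0): print every group's plot, store nothing
iris %>%
  group_by(Species) %>%
  group_walk(~ print(make_species_plot(.x)))
```

Keeping print() out of the helper means the same function can serve both uses: saving the plots for later or drawing them straight away.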
aosmith answered Mar 29 '23 05:03