I regularly used d_ply to produce exploratory plots.
A trivial example:
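library(plyr)
library(ggplot2)  # qplot() comes from ggplot2, not plyr

plot_species <- function(species_data) {
  p <- qplot(data = species_data,
             x = Sepal.Length,
             y = Sepal.Width)
  print(p)
}

# Call plot_species once per species; d_ply discards the return values
d_ply(.data = iris,
      .variables = "Species",
      .fun = plot_species)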
This produces three separate plots, one for each species.
I would like to reproduce this behaviour using functions in dplyr.
This seems to require the reassembly of the data.frame within the function called by summarise, which is often impractical.
library(dplyr)

iris_by_species <- group_by(iris, Species)

plot_species <- function(Sepal.Length, Sepal.Width) {
  species_data <- data.frame(Sepal.Length, Sepal.Width)
  p <- qplot(data = species_data,
             x = Sepal.Length,
             y = Sepal.Width)
  print(p)
}

summarise(iris_by_species, plot_species(Sepal.Length, Sepal.Width))
Can parts of the data.frame be passed to the function called by summarise directly, rather than passing columns?
I believe you can use do for this task with the same function you used with d_ply. It prints directly to the plotting window, and it also saves the plots as a list within the resulting data.frame if you use a named argument (see the help page; this is essentially like using dlply). I don't fully grasp everything do can do, but if I don't use a named argument I get an error message, although the plots still print to the plotting window (in RStudio).
plot_species <- function(species_data) {
  p <- qplot(data = species_data,
             x = Sepal.Length,
             y = Sepal.Width)
  print(p)
}

# The named argument (plot = ...) stores each group's result in a list column
group_by(iris, Species) %>%
  do(plot = plot_species(.))
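If you are on a more recent dplyr (0.8.0 or later, which is an assumption about your setup), group_walk may be a closer match to d_ply, since it calls a function on each group purely for its side effects. The sketch below is one way it could look, not the only one. The two-argument form of plot_species and the use of ggplot() with a title are my own choices for illustration: group_walk passes the rows of each group as the first argument and a one-row data frame holding the grouping key as the second.

```r
library(dplyr)
library(ggplot2)

# species_data: rows of one group (grouping column dropped by default)
# key: one-row data frame holding the group's Species value
plot_species <- function(species_data, key) {
  p <- ggplot(species_data, aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point() +
    ggtitle(as.character(key$Species))
  print(p)
}

# Called for its side effect (plotting); the grouped input is returned invisibly
iris %>%
  group_by(Species) %>%
  group_walk(plot_species)
```

Because the grouping column is not part of the data passed to the function, the species name has to come from the key, which is why it is used for the title here.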