
Storing ggplot objects in a list from within loop in R

Tags: plot, r, ggplot2

My problem is similar to this one: when I generate plot objects (in this case histograms) in a loop, it seems that all of them are overwritten by the most recent plot.

To debug, within the loop, I am printing the index and the generated plot, both of which appear correctly. But when I look at the plots stored in the list, they are all identical except for the label.

(I'm using multiplot to make a composite image, but you get the same outcome if you print(myplots[[1]]) through print(myplots[[4]]) one at a time.)

Because I already have an attached dataframe (unlike the poster of the similar problem), I am not sure how to solve the problem.

(By the way, the columns are factors in the original dataset I am approximating here, but the same problem occurs if they are integers.)

Here is a reproducible example:

    library(ggplot2)
    source("http://peterhaschke.com/Code/multiplot.R") # load multiplot function

    # make sample data
    col1 <- c(2, 4, 1, 2, 5, 1, 2, 0, 1, 4, 4, 3, 5, 2, 4, 3, 3, 6, 5, 3, 6, 4, 3, 4, 4, 3, 4,
              2, 4, 3, 3, 5, 3, 5, 5, 0, 0, 3, 3, 6, 5, 4, 4, 1, 3, 3, 2, 0, 5, 3, 6, 6, 2, 3,
              3, 1, 5, 3, 4, 6)
    col2 <- c(2, 4, 4, 0, 4, 4, 4, 4, 1, 4, 4, 3, 5, 0, 4, 5, 3, 6, 5, 3, 6, 4, 4, 2, 4, 4, 4,
              1, 1, 2, 2, 3, 3, 5, 0, 3, 4, 2, 4, 5, 5, 4, 4, 2, 3, 5, 2, 6, 5, 2, 4, 6, 3, 3,
              3, 1, 4, 3, 5, 4)
    col3 <- c(2, 5, 4, 1, 4, 2, 3, 0, 1, 3, 4, 2, 5, 1, 4, 3, 4, 6, 3, 4, 6, 4, 1, 3, 5, 4, 3,
              2, 1, 3, 2, 2, 2, 4, 0, 1, 4, 4, 3, 5, 3, 2, 5, 2, 3, 3, 4, 2, 4, 2, 4, 5, 1, 3,
              3, 3, 4, 3, 5, 4)
    col4 <- c(2, 5, 2, 1, 4, 1, 3, 4, 1, 3, 5, 2, 4, 3, 5, 3, 4, 6, 3, 4, 6, 4, 3, 2, 5, 5, 4,
              2, 3, 2, 2, 3, 3, 4, 0, 1, 4, 3, 3, 5, 4, 4, 4, 3, 3, 5, 4, 3, 5, 3, 6, 6, 4, 2,
              3, 3, 4, 4, 4, 6)
    data2 <- data.frame(col1, col2, col3, col4)
    data2[, 1:4] <- lapply(data2[, 1:4], as.factor)
    colnames(data2) <- c("A", "B", "C", "D")

    # generate plots
    myplots <- list()  # new empty list
    for (i in 1:4) {
      p1 <- ggplot(data = data.frame(data2), aes(x = data2[, i])) +
        geom_histogram(fill = "lightgreen") +
        xlab(colnames(data2)[i])
      print(i)
      print(p1)
      myplots[[i]] <- p1  # add each plot into plot list
    }
    multiplot(plotlist = myplots, cols = 4)

When I look at a summary of a plot object in the plot list, this is what I see:

    > summary(myplots[[1]])
    data: A, B, C, D [60x4]
    mapping:  x = data2[, i]
    faceting: facet_null()
    -----------------------------------
    geom_histogram: fill = lightgreen
    stat_bin:
    position_stack: (width = NULL, height = NULL)

I think that mapping: x = data2[, i] is the problem, but I am stumped! I can't post images, so you'll need to run my example and look at the graphs if my explanation of the problem is confusing.

Thanks!

Asked by LizPS, Aug 13 '15 at 16:08



1 Answer

In addition to the other excellent answer, here’s a solution that uses “normal”-looking evaluation rather than eval. Since for loops have no separate variable scope (i.e. they are performed in the current environment), we need to wrap the body of the for loop in local; in addition, we need to make i a local variable, which we can do by re-assigning it to its own name [1]:

    myplots <- vector('list', ncol(data2))

    for (i in seq_along(data2)) {
        message(i)
        myplots[[i]] <- local({
            i <- i
            p1 <- ggplot(data2, aes(x = data2[[i]])) +
                geom_histogram(fill = "lightgreen") +
                xlab(colnames(data2)[i])
            print(p1)
        })
    }
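To see why the local copy matters without involving ggplot2, here is a minimal base-R sketch. It assumes that the aesthetic expression behaves like any deferred expression that looks up i only when it is finally evaluated, which is after the loop has finished. The names shared and isolated are made up for this illustration.

```r
# Closures created directly in the loop all look up the same `i`,
# which holds its final value by the time they are called.
shared <- list()
for (i in 1:3) {
  shared[[i]] <- function() i
}

# Wrapping the body in local() and copying `i` gives each closure
# its own environment with its own value of `i`.
isolated <- list()
for (i in 1:3) {
  isolated[[i]] <- local({
    i <- i
    function() i
  })
}

shared_values <- vapply(shared, function(f) f(), integer(1))
isolated_values <- vapply(isolated, function(f) f(), integer(1))

print(shared_values)
print(isolated_values)
```

The first vector repeats the last loop index, mirroring the identical plots in the question, while the second keeps one distinct value per iteration.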

However, an altogether cleaner way is to forgo the for loop entirely and use list functions to build the result. This works in several possible ways. The following is the easiest in my opinion:

    plot_data_column = function (data, column) {
        ggplot(data, aes_string(x = column)) +
            geom_histogram(fill = "lightgreen") +
            xlab(column)
    }

    myplots <- lapply(colnames(data2), plot_data_column, data = data2)

This has several advantages: it’s simpler, and it won’t clutter the environment (with the loop variable i).
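On recent ggplot2 versions, two details of the helper above may need adjusting: aes_string() has been soft-deprecated since ggplot2 3.0.0, and geom_histogram() rejects discrete (factor) x values. The following sketch, under those assumptions, uses the .data pronoun and geom_bar() instead. The small data2 here is a made-up stand-in for the question's data, not the original values.

```r
library(ggplot2)

# Hypothetical stand-in for the question's data2 (factor columns)
data2 <- data.frame(
  A = factor(c(2, 4, 1, 2, 5)),
  B = factor(c(2, 4, 4, 0, 4))
)

# Same helper shape as above; `column` is a column name given as a string.
# .data[[column]] looks the column up in the plot's data at build time,
# and each call has its own `column`, so no plot depends on a loop variable.
plot_data_column <- function(data, column) {
  ggplot(data, aes(x = .data[[column]])) +
    geom_bar(fill = "lightgreen") +  # bar counts, suitable for factors
    xlab(column)
}

myplots <- lapply(colnames(data2), plot_data_column, data = data2)
```

Each element of myplots can then be printed on its own or passed on as a plot list, as in the question.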


[1] This might seem confusing: why does i <- i have any effect at all? Because by performing the assignment we create a new, local variable with the same name as the variable in the outer scope. We could equally have used a different name, e.g. local_i <- i.

Answered by Konrad Rudolph, Oct 01 '22 at 21:10