My problem is similar to this one: when I generate plot objects (in this case histograms) in a loop, it seems that all of them are overwritten by the most recent plot.
To debug, within the loop, I am printing the index and the generated plot, both of which appear correctly. But when I look at the plots stored in the list, they are all identical except for the label.
(I'm using multiplot to make a composite image, but you get the same outcome if you print(myplots[[1]]) through print(myplots[[4]]) one at a time.)
Because I already have an attached dataframe (unlike the poster of the similar problem), I am not sure how to solve the problem.
(btw, column classes are factor in the original dataset I am approximating here, but same problem occurs if they are integer)
Here is a reproducible example:
    library(ggplot2)
    source("http://peterhaschke.com/Code/multiplot.R") #load multiplot function

    #make sample data
    col1 <- c(2, 4, 1, 2, 5, 1, 2, 0, 1, 4, 4, 3, 5, 2, 4, 3, 3, 6, 5, 3,
              6, 4, 3, 4, 4, 3, 4, 2, 4, 3, 3, 5, 3, 5, 5, 0, 0, 3, 3, 6,
              5, 4, 4, 1, 3, 3, 2, 0, 5, 3, 6, 6, 2, 3, 3, 1, 5, 3, 4, 6)
    col2 <- c(2, 4, 4, 0, 4, 4, 4, 4, 1, 4, 4, 3, 5, 0, 4, 5, 3, 6, 5, 3,
              6, 4, 4, 2, 4, 4, 4, 1, 1, 2, 2, 3, 3, 5, 0, 3, 4, 2, 4, 5,
              5, 4, 4, 2, 3, 5, 2, 6, 5, 2, 4, 6, 3, 3, 3, 1, 4, 3, 5, 4)
    col3 <- c(2, 5, 4, 1, 4, 2, 3, 0, 1, 3, 4, 2, 5, 1, 4, 3, 4, 6, 3, 4,
              6, 4, 1, 3, 5, 4, 3, 2, 1, 3, 2, 2, 2, 4, 0, 1, 4, 4, 3, 5,
              3, 2, 5, 2, 3, 3, 4, 2, 4, 2, 4, 5, 1, 3, 3, 3, 4, 3, 5, 4)
    col4 <- c(2, 5, 2, 1, 4, 1, 3, 4, 1, 3, 5, 2, 4, 3, 5, 3, 4, 6, 3, 4,
              6, 4, 3, 2, 5, 5, 4, 2, 3, 2, 2, 3, 3, 4, 0, 1, 4, 3, 3, 5,
              4, 4, 4, 3, 3, 5, 4, 3, 5, 3, 6, 6, 4, 2, 3, 3, 4, 4, 4, 6)
    data2 <- data.frame(col1,col2,col3,col4)
    data2[,1:4] <- lapply(data2[,1:4], as.factor)
    colnames(data2)<- c("A","B","C", "D")

    #generate plots
    myplots <- list()  # new empty list
    for (i in 1:4) {
        p1 <- ggplot(data=data.frame(data2),aes(x=data2[ ,i]))+
            geom_histogram(fill="lightgreen") +
            xlab(colnames(data2)[ i])
        print(i)
        print(p1)
        myplots[[i]] <- p1  # add each plot into plot list
    }
    multiplot(plotlist = myplots, cols = 4)
When I look at a summary of a plot object in the plot list, this is what I see:

    > summary(myplots[[1]])
    data: A, B, C, D [60x4]
    mapping: x = data2[, i]
    faceting: facet_null()
    -----------------------------------
    geom_histogram: fill = lightgreen
    stat_bin:
    position_stack: (width = NULL, height = NULL)
I think that mapping: x = data2[, i] is the problem, but I am stumped! I can't post images, so you'll need to run my example and look at the graphs if my explanation of the problem is confusing.
Thanks!
In addition to the other excellent answer, here’s a solution that uses “normal”-looking evaluation rather than eval. Since for loops have no separate variable scope (i.e. they are performed in the current environment), we need to use local to wrap the body of the for loop; in addition, we need to make i a local variable, which we can do by re-assigning it to its own name [1]:
    myplots <- vector('list', ncol(data2))

    for (i in seq_along(data2)) {
        message(i)
        myplots[[i]] <- local({
            i <- i  # copy the loop variable into local()'s own environment
            p1 <- ggplot(data2, aes(x = data2[[i]])) +
                geom_histogram(fill = "lightgreen") +
                xlab(colnames(data2)[i])
            print(p1)  # print() returns p1 invisibly, so local() yields the plot
        })
    }
However, an altogether cleaner way is to forgo the for loop entirely and use list functions to build the result. This works in several possible ways. The following is the easiest in my opinion:
    plot_data_column = function (data, column) {
        ggplot(data, aes_string(x = column)) +
            geom_histogram(fill = "lightgreen") +
            xlab(column)
    }

    myplots <- lapply(colnames(data2), plot_data_column, data = data2)
This has several advantages: it’s simpler, and it won’t clutter the environment (with the loop variable i).
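Recent ggplot2 releases mark aes_string() as deprecated in favour of tidy evaluation, so the same lapply approach can also be written with the .data pronoun. The following is only a sketch under a few assumptions: it needs ggplot2 3.0.0 or later, demo_data is made-up illustration data rather than the data2 from the question, and geom_bar is swapped in for geom_histogram because current ggplot2 versions refuse to bin a factor column.

```r
library(ggplot2)

# Same idea as plot_data_column above, but using the .data pronoun
# instead of aes_string(). `column` is a column name given as a string.
plot_data_column <- function(data, column) {
  ggplot(data, aes(x = .data[[column]])) +
    geom_bar(fill = "lightgreen") +  # geom_bar counts factor levels
    xlab(column)
}

# Hypothetical stand-in data (not the data2 from the question).
set.seed(1)
demo_data <- data.frame(
  A = factor(sample(0:6, 60, replace = TRUE)),
  B = factor(sample(0:6, 60, replace = TRUE))
)

# One plot per column; each call has its own `column`, so nothing is shared.
myplots <- lapply(colnames(demo_data), plot_data_column, data = demo_data)
```

Because every call to plot_data_column gets its own function environment, each plot keeps its own column name, which is the same isolation that local provides in the loop version.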
[1] This might seem confusing: why does i <- i have any effect at all? Because by performing the assignment we create a new, local variable with the same name as the variable in the outer scope. We could equally have used a different name, e.g. local_i <- i.
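The effect of i <- i can be seen without ggplot2 at all. The base R sketch below uses plain closures in place of plots; the names lazy and fixed are purely illustrative. The first loop reproduces the problem (everything refers to the one shared i), and the second applies the local fix.

```r
# Closures created directly in a for loop all look up the same `i`,
# so they see whatever value it has when they are finally called.
lazy <- vector("list", 3)
for (i in 1:3) {
  lazy[[i]] <- function() i
}
lazy_results <- vapply(lazy, function(f) f(), integer(1))

# Wrapping the body in local() and re-assigning i gives each closure
# its own private copy of the value at that iteration.
fixed <- vector("list", 3)
for (i in 1:3) {
  fixed[[i]] <- local({
    i <- i  # new variable in local()'s environment
    function() i
  })
}
fixed_results <- vapply(fixed, function(f) f(), integer(1))

print(lazy_results)   # every element is the final loop value, 3
print(fixed_results)  # 1 2 3
```

The stored plots behave like the lazy closures: as the summary output in the question shows, the mapping is kept as the unevaluated expression data2[, i], and i is only looked up when the plot is drawn.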