I'm still trying to get my head around using loops to plot in R. I would like to plot (any plot to visualise the data will do) columns z_1 against z_2 in the data frame below according to the different names in column x_1.
x_1 <- c("A1", "A1","A1", "B10", "B10", "B10","B10", "C100", "C100", "C100")
z_1 <- rnorm(10, 70)
z_2 <- rnorm(10, 1.7)
A <- data.frame(x_1, z_1, z_2)
As such, I would like to end up with three different plots: one for category A1, one for B10 and another for C100. I can do this with three separate pieces of code, but I would like to use a loop or any other single piece of code to draw all three plots on the same page. In reality, I have a large dataset (4,000 rows) and would like to plot several IDs per page (say 5 on a page).
I hope this makes sense. Thanks for your help.
Here's my attempt at plotting them individually:
for A1:
data_A1 <- A[which(A$x_1 == "A1"), ]
plot(data_A1$z_2, data_A1$z_1)
I also tried something like this, but I am getting error messages:
for ( i in A$x_1[[i]]){
plot(A[which(A$x_1==A$x_1[[i]]), ], aspect = 1)
}
It is straightforward to combine plots in base R with mfrow and mfcol graphical parameters. You just need to specify a vector with the number of rows and the number of columns you want to create.
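As a minimal sketch of that idea, the block below rebuilds the data frame A from the question (the set.seed() call is my addition so the block runs on its own) and uses a 1 x 3 layout to place one plot per category on the same page:

```r
# Rebuild the example data; set.seed() is an addition for reproducibility
set.seed(1)
x_1 <- c("A1", "A1", "A1", "B10", "B10", "B10", "B10", "C100", "C100", "C100")
A <- data.frame(x_1, z_1 = rnorm(10, 70), z_2 = rnorm(10, 1.7))

ids <- unique(as.character(A$x_1))

# One row, one column per category; par() returns the previous settings
old_par <- par(mfrow = c(1, length(ids)))
for (id in ids) {
  d <- A[A$x_1 == id, ]
  plot(d$z_2, d$z_1, xlab = "z_2", ylab = "z_1", main = id)
}
par(old_par)  # restore the previous layout
```

Using mfcol instead of mfrow only changes the order in which the panels are filled (by column rather than by row).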
A simple approach with loops would be
for (cat in unique(x_1)) {
  d <- subset(A, x_1 == cat)
  plot(d$z_1, d$z_2)
}
unique(x_1) gets you all the unique values of x_1. Then, for each of these values, get the corresponding subset and use that subset for plotting.
Just to understand why your original code didn't work:
Setting up data works fine
x_1 <- c("A1", "A1", "A1", "B10", "B10", "B10","B10", "C100", "C100", "C100")
z_1 <- rnorm(10, 70)
z_2 <- rnorm(10, 1.7)
A <- data.frame(x_1, z_1, z_2)
The individual plot works fine, but as I said in a comment, the which is unnecessary:
data_A1 <- A[which(A$x_1 == "A1"), ] # your way
plot(data_A1$z_2, data_A1$z_1)
data_A1 <- A[A$x_1 == "A1", ] # deleting which() makes it cleaner
with(data_A1, plot(z_2, z_1)) # you can also use with() to save typing
Now the for loop. Let's review a simple for loop in R (pretty close to the example in ?"for"):
for (i in 1:5) {
  print(1:i)
}
Pretty straightforward: 1:5 is c(1, 2, 3, 4, 5), so first i is 1, then 2, etc. Your for loop has a problem in that first line:
for (i in A$x_1[[i]]) { ## already a problem
First i is A$x_1[[i]]? That won't work, i isn't defined yet. Also, A$x_1 is a vector, not a list, so you shouldn't be using [[ to subset it. But we don't want a subset yet, we want a vector of the values i should take. What we want in this case is for (i in c("A1", "B10", "C100")), but we also want to do it programmatically instead of typing out all the different possibilities. There are a couple of common ways to get that:
unique(A$x_1) # as in Mark's solution
levels(A$x_1) # works only if A$x_1 is a factor (the data.frame() default before R 4.0.0)
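To see what each expression returns, here is a small check on a hypothetical vector (not the asker's data frame). Since R 4.0.0, data.frame() keeps strings as character vectors by default, so levels() only helps after an explicit conversion to a factor:

```r
x <- c("A1", "A1", "B10", "C100", "C100")  # hypothetical character vector

unique(x)    # distinct values, in order of first appearance
levels(x)    # NULL: a character vector has no levels

f <- factor(x)
levels(f)    # the factor's levels, sorted by default
```

unique() works for both character vectors and factors, which makes it the safer choice when you are not sure how the column was created.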
We can put either of those expressions after the in. I changed your [[ to [ in the plot call, since [[ extracts a single element and is mostly used with lists. I also took out the unnecessary which():
for (i in unique(A$x_1)) { # this line is good
  plot(A[A$x_1 == A$x_1[i], ], aspect = 1) # still a problem
}
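Let's remind ourselves what values i is taking: "A1", "B10", "C100". What's A$x_1 == A$x_1["A1"] going to give? Nothing useful. Indexing with a string looks up an element by name, and A$x_1 has no names, so the result is NA and the comparison is NA for every row.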
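A quick console check on a hypothetical unnamed vector shows the difference between indexing by the value and comparing against it:

```r
x <- c("A1", "A1", "B10")   # hypothetical, unnamed vector

x["A1"]        # NA: there is no element *named* "A1"
x == x["A1"]   # NA for every element, so it selects nothing useful
x == "A1"      # the comparison we actually want
```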
for (i in unique(A$x_1)) {
  plot(A[A$x_1 == i, ], aspect = 1) # getting there
}
The above code plots something, and it's neat, but it's not what you want. There's a bunch of warnings, all of them telling us that aspect isn't a valid argument, so we'll delete it. Looking at the plot, you'll see that it's plotting 3 variables, because we haven't told it what to put on the x and y axes.
for (i in unique(A$x_1)) {
  plot(A[A$x_1 == i, "z_2"], A[A$x_1 == i, "z_1"]) # z_2 on x, z_1 on y
} # Works!!!
Notice that this is almost identical to Mark's answer. You don't have to use i and j in for loops; he used cat. It's good practice to use a more descriptive name.
Now let's fancy it up a little:
for (i in unique(A$x_1)) {
  plot(A[A$x_1 == i, "z_2"], A[A$x_1 == i, "z_1"],
       xlim = range(A$z_2), ylim = range(A$z_1), # base the axes on full data range
       main = paste("Plot of", i)) # Give each a title
}
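For the larger dataset mentioned in the question (several IDs per page), one possible extension is to split the IDs into groups of five and start a fresh page for each group. This is only a sketch under assumptions: the simulated data, the page size, and the output file name plots_by_id.pdf are all made up for illustration.

```r
set.seed(42)                       # addition, for reproducibility
ids_all <- paste0("ID", 1:12)      # hypothetical IDs
A <- data.frame(x_1 = rep(ids_all, each = 4),
                z_1 = rnorm(48, 70),
                z_2 = rnorm(48, 1.7))

per_page <- 5
ids <- unique(as.character(A$x_1))
# Group the IDs into chunks of at most `per_page` IDs
pages <- split(ids, ceiling(seq_along(ids) / per_page))

out_file <- file.path(tempdir(), "plots_by_id.pdf")  # hypothetical file name
pdf(out_file, width = 12, height = 3)
for (page_ids in pages) {
  par(mfrow = c(1, per_page))      # resets the layout, so the next plot opens a new page
  for (id in page_ids) {
    d <- A[A$x_1 == id, ]
    plot(d$z_2, d$z_1, xlab = "z_2", ylab = "z_1", main = id,
         xlim = range(A$z_2), ylim = range(A$z_1))
  }
}
invisible(dev.off())               # close the file so it is written to disk
```

With 12 IDs this gives three pages holding 5, 5 and 2 plots. Writing to a PDF is a choice made here for convenience; the same loop works on a screen device, where each new page simply replaces the previous one.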
Next time: don't forget that you can run tiny pieces of code to see what they are. If you have a line like for (i in A$x_1[[i]]) and you're not sure it's right, enter A$x_1[[i]] at the console. Hopefully that will help you figure out that you haven't defined i, so you'll change it to for (i in A$x_1). Then you run A$x_1 and realize its length is 10. You want 3 graphs, not 10, so you need i to take 3 values, all of them different, etc.
Perhaps you don't need a loop. Try using ggplot2's facet_grid(). Here is the documentation, full of examples.
library(ggplot2)
library(reshape2)
melted_a <- melt(A)
ggplot(melted_a, aes(variable, value)) +
geom_jitter() +
facet_grid(. ~ x_1)
ggplot(melted_a, aes(variable, value)) +
geom_jitter() +
facet_grid(variable ~ x_1)
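If the goal is specifically z_1 against z_2 with one panel per ID, a possible variation (my reading of the question, not part of the original answer) is to skip the melt step and facet the original data frame with facet_wrap(). The data are rebuilt here with a seed so the block runs on its own:

```r
library(ggplot2)

set.seed(1)  # addition, for reproducibility
x_1 <- c("A1", "A1", "A1", "B10", "B10", "B10", "B10", "C100", "C100", "C100")
A <- data.frame(x_1, z_1 = rnorm(10, 70), z_2 = rnorm(10, 1.7))

p <- ggplot(A, aes(x = z_2, y = z_1)) +
  geom_point() +
  facet_wrap(~ x_1)   # one panel per value of x_1
print(p)              # inside a loop or function, print() is needed to draw the plot
```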
Edit
Perhaps this solves the problem. But if you need to make many plots that have a similar structure, you could write a function and use aes_string() instead of aes().
Note: I'm not an expert at writing functions, so probably someone could edit and improve it. (not tested)
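ggplot_fun <- function(data, x, y, rowfacet, colfacet, ...) {
  p <- ggplot(data, aes_string(x, y))
  p <- p + geom_jitter()
  # build the facet formula from strings, e.g. "variable ~ x_1"
  p <- p + facet_grid(as.formula(sprintf("%s ~ %s", rowfacet, colfacet)))
  p # return the plot so it is drawn when the function is called at the console
}
# aes_string() and sprintf() need the column names as strings
ggplot_fun(melted_a, "variable", "value", "variable", "x_1")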
Idea taken from this question.
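In recent ggplot2 releases aes_string() is deprecated. A hedged alternative, assuming a ggplot2 version that provides the .data pronoun, keeps the same string-based interface. The function name ggplot_fun2 and the long-format stand-in for melted_a are mine, not from the original answer:

```r
library(ggplot2)

# Hypothetical rewrite of ggplot_fun(): column names are passed as strings
ggplot_fun2 <- function(data, x, y, rowfacet, colfacet) {
  ggplot(data, aes(.data[[x]], .data[[y]])) +
    geom_jitter() +
    # reformulate("x_1", response = "variable") builds the formula variable ~ x_1
    facet_grid(reformulate(colfacet, response = rowfacet))
}

# Long-format stand-in for melted_a, built without reshape2
set.seed(1)  # addition, for reproducibility
x_1 <- c("A1", "A1", "A1", "B10", "B10", "B10", "B10", "C100", "C100", "C100")
melted_a <- data.frame(
  x_1      = rep(x_1, times = 2),
  variable = rep(c("z_1", "z_2"), each = 10),
  value    = c(rnorm(10, 70), rnorm(10, 1.7))
)

p <- ggplot_fun2(melted_a, "variable", "value", "variable", "x_1")
print(p)
```

reformulate() comes from the stats package and avoids pasting the formula together by hand, though the sprintf() approach above works just as well.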