
Selecting data frame columns to plot in ggplot2

Tags: r, ggplot2

I have a big table of data with ~150 columns. I need to make a series of histograms from about a third of them. Rather than putting 50 lines of the same plot command in my script, I want to loop over a list that tells me which columns to use. Here is a test dataset to illustrate:

d <- data.frame(c(rep("A",5), rep("B",5)),
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE))

colnames(d) <- c("col1","col2","col3","col4","col5","col6" )


ggplot(data=d, aes(col2, fill= col1)) + geom_density(alpha = 0.5)

So, rather than writing this 50 times and replacing the aes() values each time, I really want to do something more like this...

cols_to_plot <- c("col2","col4","col6")

for (i in length(cols_to_plot)) {
  ggplot(data=d, aes(cols_to_plot[i], fill= col1)) + geom_density(alpha = 0.5)

} 

But of course, this doesn't work... Is there a way to do this kind of thing?

Thanks!

caddymob asked Nov 06 '12 22:11


2 Answers

I think you'd be better off if you melted your data. Try this:

library(reshape2)
d2 <- melt(d, id='col1')
ggplot(d2, aes(value, fill=col1)) + geom_density(alpha=.5) + facet_wrap(~variable)
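
If only some of the columns should be plotted, the same melt approach can be restricted to them. The following is a minimal sketch, assuming reshape2's melt() with its id.vars and measure.vars arguments and reusing cols_to_plot from the question. The set.seed() call and the rebuilt d are illustrative additions, there only to make the example self-contained:

```r
library(reshape2)
library(ggplot2)

set.seed(1)  # illustrative only: makes the random test data reproducible

# Same shape as the question's test data: one grouping column, five numeric ones
d <- data.frame(col1 = rep(c("A", "B"), each = 5),
                col2 = sample(1:10, 10, replace = TRUE),
                col3 = sample(1:10, 10, replace = TRUE),
                col4 = sample(1:10, 10, replace = TRUE),
                col5 = sample(1:10, 10, replace = TRUE),
                col6 = sample(1:10, 10, replace = TRUE))

cols_to_plot <- c("col2", "col4", "col6")

# Melt only the selected columns; col1 stays as the identifier
d2 <- melt(d, id.vars = "col1", measure.vars = cols_to_plot)

# One facet per selected column
p <- ggplot(d2, aes(value, fill = col1)) +
  geom_density(alpha = 0.5) +
  facet_wrap(~variable)
print(p)
```

With ~50 selected columns this yields a single figure with 50 panels, so it may be worth splitting cols_to_plot into smaller batches.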

Or, if you want to keep your original loop, use aes_string() inside it, like:

ggplot(data=d, aes_string(cols_to_plot[i], fill='col1')) + geom_density(alpha = 0.5)
Harlan answered Oct 28 '22 12:10


There is an alternative to aes(): aes_string(). With it you can pass the aesthetic mappings as strings. Note that col1 then has to be quoted as well, as in fill = "col1". Also note that inside a for() loop you need to explicitly print() a ggplot object for the plot to be drawn on the current device. Finally, loop over seq_along(cols_to_plot); for (i in length(cols_to_plot)) runs only once, with i = 3.

d <- data.frame(c(rep("A",5), rep("B",5)),
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE))

colnames(d) <- c("col1","col2","col3","col4","col5","col6" )

cols_to_plot <- c("col2","col4","col6")

for (i in seq_along(cols_to_plot)) {
  print(ggplot(data=d, aes_string(x = cols_to_plot[i], fill= "col1")) +
    geom_density(alpha = 0.5))
}
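
In ggplot2 3.0.0 and later, aes_string() is, as far as I know, soft-deprecated in favour of the .data pronoun inside aes(). The same loop can be sketched that way, assuming a recent ggplot2. The plots list, the labs() call, and the set.seed() line are illustrative additions, not part of the original answer:

```r
library(ggplot2)

set.seed(1)  # illustrative only: makes the random test data reproducible

d <- data.frame(col1 = rep(c("A", "B"), each = 5),
                col2 = sample(1:10, 10, replace = TRUE),
                col3 = sample(1:10, 10, replace = TRUE),
                col4 = sample(1:10, 10, replace = TRUE),
                col5 = sample(1:10, 10, replace = TRUE),
                col6 = sample(1:10, 10, replace = TRUE))

cols_to_plot <- c("col2", "col4", "col6")

# Build one plot per column name; .data[[col]] looks the column up by string
plots <- lapply(cols_to_plot, function(col) {
  ggplot(d, aes(x = .data[[col]], fill = col1)) +
    geom_density(alpha = 0.5) +
    labs(x = col)  # label the axis with the column name
})
names(plots) <- cols_to_plot

# As in the for() loop above, each plot must be printed explicitly
for (p in plots) print(p)
```

Keeping the plots in a named list means a single one can be redrawn or saved later, e.g. print(plots[["col4"]]).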
Gavin Simpson answered Oct 28 '22 12:10