I am a big fan of facet_wrap
. Though it is fast to split a big data frame and plot several plots and explore within R, it's not always the best tool for presenting in paper or in slides.
I find myself wasting a lot of time with scales, binwidths and font sizes and ultimately modifying on inkscape the plot.
Sometimes I just subset my data frame into many data frames and plot individually for each one. Later join them with multiplot
or by hand.
I was wondering if there might be a way of making the ggplot call almost in the same way (one big df with a factor column used for faceting) or a way to make ggplot read from something with list-like data frame separated by my faceting factor. The ideal output should be multiple single plots which I'll edit later on inkscape (and use free_y
scales to make it less painful)
To be clear,
df<-mtcars
ggplot(df,aes(df$mpg,df$disp,color=factor(cyl)))+
geom_point(aes(df$mpg,df$disp))+
facet_wrap( ~cyl)
Produces one plot. My desired output in this case would be three plots, one for each facet.
The facet_grid() function will produce a grid of plots for each combination of variables that you specify, even if some plots are empty. The facet_wrap() function will only produce plots for the combinations of variables that have values, which means it won't produce any empty plots.
To remove the label from facet plot, we need to use “strip. text. x” argument inside the theme() layer with argument 'element_blank()'.
facet_wrap() with two variables Compute the counts for the plot so we have two variables to use in faceting: marvel_count <- count(marvel, year, align, gender) glimpse(marvel_count) ## Observations: 155 ## Variables: 4 ## $ year <dbl> 1939, 1939, 1940, 1940, 1940, 1941, 1941, 1943, 1944, 19...
facet_grid() forms a matrix of panels defined by row and column faceting variables. It is most useful when you have two discrete variables, and all combinations of the variables exist in the data. If you have only one variable with many levels, try facet_wrap() .
You can use lapply
to create a list with one plot for each value of cyl
:
# Create a separate plot for each value of cyl, and store each plot in a list
p.list = lapply(sort(unique(mtcars$cyl)), function(i) {
ggplot(mtcars[mtcars$cyl==i,], aes(mpg, disp, colour=factor(cyl))) +
geom_point(show.legend=FALSE) +
facet_wrap(~cyl) +
scale_colour_manual(values=hcl(seq(15,365,length.out=4)[match(i, sort(unique(mtcars$cyl)))], 100, 65))
})
The complicated scale_colour_manual
argument is a way to color the point markers the same way they would be colored if all the values of cyl
were included in a single call to ggplot
.
UPDATE: To address your comments, how about this:
# Fake data
set.seed(15)
dat = data.frame(group=rep(c("A","B","C"), each=100),
value=c(mapply(rnorm, 100, c(5,10,20), c(1,3,5))))
p.list = lapply(sort(unique(dat$group)), function(i) {
ggplot(dat[dat$group==i,], aes(value, fill=group)) +
geom_histogram(show.legend=FALSE, colour="grey20", binwidth=1) +
facet_wrap(~group) +
scale_fill_manual(values=hcl(seq(15,365,length.out=4)[match(i, sort(unique(dat$group)))], 100, 65)) +
scale_x_continuous(limits=range(dat$value)) +
theme_gray(base_size=15)
})
The result is below. Note that the code above gives you the same x-scale on all three graphs, but not the same y-scale. To get the same y-scale, you can either hard-code it as, say, scale_y_continuous(limits = c(0,35))
, or you can find the maximum count programmatically for whatever binwidth you set, and then feed that to scale_y_continuous
.
# Arrange all three plots together
library(gridExtra)
do.call(grid.arrange, c(p.list, nrow=3))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With