
How to create a facet in ggplot, except with different variables

Tags: r, ggplot2

I have a data frame with 3 variables, all of which are wind speeds. I want to check how well the hardware was calibrated by plotting each variable against every other. There are three variables in this instance, but there may be up to 6.

With three variables this gives 3 different graphs, each with a different pair of x and y variables. I'd really like to plot these using facets, or something with the same appearance.

Here is some sample data, in a data frame called wind:

wind <- structure(list(speed_60e = c(3.029, 3.158, 2.881, 2.305, 2.45, 
2.358, 2.325, 2.723, 2.567, 1.972, 2.044, 1.745, 2.1, 2.08, 1.914, 
2.44, 2.356, 1.564, 1.942, 1.413, 1.756, 1.513, 1.263, 1.301, 
1.403, 1.496, 1.828, 1.8, 1.841, 2.014), speed_60w = c(2.981, 
3.089, 2.848, 2.265, 2.406, 2.304, 2.286, 2.686, 2.511, 1.946, 
2.004, 1.724, 2.079, 2.058, 1.877, 2.434, 2.375, 1.562, 1.963, 
1.436, 1.743, 1.541, 1.256, 1.312, 1.402, 1.522, 1.867, 1.837, 
1.873, 2.055), speed_40 = c(2.726, 2.724, 2.429, 2.028, 1.799, 
1.863, 1.987, 2.445, 2.282, 1.938, 1.721, 1.466, 1.841, 1.919, 
1.63, 2.373, 2.22, 1.576, 1.693, 1.185, 1.274, 1.421, 1.071, 
1.163, 1.166, 1.504, 1.77, 1.778, 1.632, 1.545)), .Names = c("speed_60e", 
"speed_60w", "speed_40"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", 
"25", "26", "27", "28", "29", "30"))

R> head(wind)
  speed_60e speed_60w speed_40
1     3.029     2.981    2.726
2     3.158     3.089    2.724
3     2.881     2.848    2.429
4     2.305     2.265    2.028
5     2.450     2.406    1.799
6     2.358     2.304    1.863

I wish to plot three square graphs. An individual one can be plotted by calling

ggplot() + geom_point(data=wind, aes(wind[,1],wind[,3]), alpha=I(1/30), 
                      shape=I(20), size=I(1))

Any idea how I can do this?

Chris asked Apr 01 '11 10:04




2 Answers

Will something like this do?

plotmatrix(data = wind) + geom_smooth(method="lm")

Which gives:

[Figure: pairs plot in ggplot]

Hadley calls this a "Crude experimental scatterplot matrix", but it might suffice for your needs?
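Later ggplot2 releases dropped plotmatrix(), so the call above may fail on a current installation. A rough equivalent, assuming the separate GGally package is installed (it is not part of this answer), is ggpairs(). The sketch below uses only the first six rows of the sample data shown in the question:

```r
# Assumption: GGally is installed; it is an add-on package, not part of ggplot2.
library(GGally)

# First six rows of the sample data from the question.
wind <- data.frame(
  speed_60e = c(3.029, 3.158, 2.881, 2.305, 2.450, 2.358),
  speed_60w = c(2.981, 3.089, 2.848, 2.265, 2.406, 2.304),
  speed_40  = c(2.726, 2.724, 2.429, 2.028, 1.799, 1.863)
)

# Scatterplot matrix with one row and one column of panels per variable.
p <- ggpairs(wind)
print(p)
```

With its default settings ggpairs() draws scatterplots below the diagonal, densities on the diagonal and correlations above it, so it is not a drop-in match for the figure here.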

Edit: Currently, plotmatrix() isn't quite flexible enough to handle all of @Chris' requirements regarding specification of the geom_point() layer. However, we can cut the guts out of plotmatrix() and use Hadley's nice code to create the data structure needed for plotting, then plot it however we like using standard ggplot() calls. This function also drops the densities, but you can look into the code for plotmatrix() to see how to get them.

First, a function that expands the data from the wide format to the repeated format required for a pairs plot, where we plot each variable against every other, but not against itself.

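Expand <- function(data) {
    # Every ordered pair of column indices, excluding a column paired with itself
    grid <- expand.grid(x = 1:ncol(data), y = 1:ncol(data))
    grid <- subset(grid, x != y)
    # One block of rows per pair, stacked into a single long data frame
    all <- do.call("rbind", lapply(1:nrow(grid), function(i) {
        xcol <- grid[i, "x"]
        ycol <- grid[i, "y"]
        # xvar labels the facet row (the y column) and yvar the facet column
        # (the x column), matching facet_grid(xvar ~ yvar) below
        data.frame(xvar = names(data)[ycol], yvar = names(data)[xcol], 
                   x = data[, xcol], y = data[, ycol], data)
    }))
    # Fix the factor levels so panels follow the original column order
    all$xvar <- factor(all$xvar, levels = names(data))
    all$yvar <- factor(all$yvar, levels = names(data))
    all
}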

Note: all this does is steal Hadley's code from plotmatrix() - I have done nothing fancy here.

Expand the data:

wind2 <- Expand(wind)

Now we can plot this as any other long-format data object required by ggplot():

ggplot(wind2, aes(x = x, y = y)) + 
    geom_point(alpha = I(1/10), shape = I(20), size = I(1)) + 
    facet_grid(xvar ~ yvar, scales = "free")
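The grid above shows every pair twice (once in each orientation), so three variables give six panels. If only one panel per pair is wanted, as in the question, a minimal sketch is to build the pairs with combn() and wrap them with facet_wrap(). The helper name expand_pairs and the column name pair are made up for illustration, and the data are just the first six rows of the sample:

```r
library(ggplot2)

# First six rows of the sample data from the question.
wind <- data.frame(
  speed_60e = c(3.029, 3.158, 2.881, 2.305, 2.450, 2.358),
  speed_60w = c(2.981, 3.089, 2.848, 2.265, 2.406, 2.304),
  speed_40  = c(2.726, 2.724, 2.429, 2.028, 1.799, 1.863)
)

# Hypothetical helper (not taken from plotmatrix()): one block of rows per
# unordered pair of columns, so 3 columns give 3 panels and 6 give 15.
expand_pairs <- function(data) {
  pairs <- combn(names(data), 2, simplify = FALSE)
  do.call("rbind", lapply(pairs, function(p) {
    data.frame(pair = paste(p[1], "vs", p[2]),
               x = data[[p[1]]], y = data[[p[2]]])
  }))
}

wind_pairs <- expand_pairs(wind)

p <- ggplot(wind_pairs, aes(x = x, y = y)) +
  geom_point(shape = 20) +
  facet_wrap(~ pair, scales = "free") +
  theme(aspect.ratio = 1)  # square panels, as the question asks
print(p)
```

Setting the aspect ratio through theme() rather than coord_fixed() keeps the panels square while still allowing free scales.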

If you want the densities, then we can pull that bit of code out too, into a helper function:

makeDensities <- function(data) {
    densities <- do.call("rbind", lapply(1:ncol(data), function(i) {
        data.frame(xvar = names(data)[i], yvar = names(data)[i], 
                   x = data[, i])
    }))
    densities
}

Then compute the densities for the original data:

dens <- makeDensities(wind)

and then add them using the same bit of code from plotmatrix():

ggplot(wind2, aes(x = x, y = y)) + 
       geom_point(alpha = I(1/10), shape = I(20), size = I(1)) + 
       facet_grid(xvar ~ yvar, scales = "free")+
       stat_density(aes(x = x, y = ..scaled.. * diff(range(x)) + min(x)),
                    data = dens, position = "identity", colour = "grey20", 
                    geom = "line")

A complete version of the original figure I showed above but using the extracted code would be:

ggplot(wind2, aes(x = x, y = y)) + 
       geom_point(alpha = I(1/10), shape = I(20), size = I(1)) + 
       facet_grid(xvar ~ yvar, scales = "free")+
       stat_density(aes(x = x, y = ..scaled.. * diff(range(x)) + min(x)),
                    data = dens, position = "identity", colour = "grey20", 
                    geom = "line") +
       geom_smooth(method="lm")

giving:

[Figure: custom version of the pairs plot]

Gavin Simpson answered Sep 21 '22 10:09


Melt the data first (convert it to long form).

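library(reshape2)  # assumed source of melt(); the answer does not name the package
library(ggplot2)
mwind <- melt(wind)
ggplot(mwind, aes(value)) + geom_histogram() + facet_wrap(~ variable)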

If you want to plot points, you need to add an index variable for the x axis.
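As a rough sketch of that last step, assuming melt() comes from the reshape2 package (the answer does not say which package it uses), the row number can serve as the index. The column name index is an arbitrary choice, and the data are the first six rows of the sample:

```r
library(reshape2)  # assumption: melt() is taken from reshape2
library(ggplot2)

# First six rows of the sample data from the question.
wind <- data.frame(
  speed_60e = c(3.029, 3.158, 2.881, 2.305, 2.450, 2.358),
  speed_60w = c(2.981, 3.089, 2.848, 2.265, 2.406, 2.304),
  speed_40  = c(2.726, 2.724, 2.429, 2.028, 1.799, 1.863)
)

# Add the observation number, then keep it as an id column while melting
# so that it survives into the long data frame.
wind$index <- seq_len(nrow(wind))
mwind <- melt(wind, id.vars = "index")

# One panel per sensor, with the observation number on the x axis.
p <- ggplot(mwind, aes(index, value)) +
  geom_point() +
  facet_wrap(~ variable)
print(p)
```

This plots each series against observation order rather than one sensor against another, so it complements the pairs plots in the other answer instead of replacing them.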

Richie Cotton answered Sep 20 '22 10:09