
plotting a line graph with values from the same column




I have a melted data frame with your standard id, variable and value columns. variable has 4 levels.

I want to use ggplot to draw scatter plots of the values in value, with each level of variable plotted against each of the others.

to illustrate

data.frame(id= gl(4,1,labels=paste("id",1:4,sep="")), variable=gl(4,4,labels=LETTERS[1:4]),value=rnorm(16))

        id variable        value
1  id1        A -0.494270766
2  id2        A  0.189400188
3  id3        A -0.550961030
4  id4        A -1.046945450
5  id1        B -0.525552660
6  id2        B -0.293601677
7  id3        B  0.009664513
8  id4        B -0.214687215
9  id1        C  1.253551926
10 id2        C -1.241847326
11 id3        C -0.307036508
12 id4        C -0.228632605
13 id1        D -1.683798512
14 id2        D -0.419295267
15 id3        D -0.154469178
16 id4        D -0.763460558

I want to produce ggplot scatter plots for each pair of variables (A vs B, A vs C, A vs D, B vs C, and so on), and then add smoothers to them afterwards.

Cheers, Davy

Davy Kavanagh asked Dec 27 '22 04:12

2 Answers

Here's a slightly modified version of plotmatrix in ggplot2 that does this:

library(reshape2)  # dcast()
library(ggplot2)

dat <- data.frame(id= gl(4,1,labels=paste("id",1:4,sep="")), variable=gl(4,4,labels=LETTERS[1:4]),value=rnorm(16))

# long to wide: one row per id, one column per level of variable
dat <- dcast(dat, id ~ variable, value.var = "value")

plotmatrix <- function (data, mapping = aes(), colour = "black") 
{
    # every ordered pair of distinct columns
    grid <- expand.grid(x = 1:ncol(data), y = 1:ncol(data))
    grid <- subset(grid, x != y)
    all <- do.call("rbind", lapply(1:nrow(grid), function(i) {
        xcol <- grid[i, "x"]
        ycol <- grid[i, "y"]
        data.frame(xvar = names(data)[ycol], yvar = names(data)[xcol], 
            x = data[, xcol], y = data[, ycol], data)
    }))
    all$xvar <- factor(all$xvar, levels = names(data))
    all$yvar <- factor(all$yvar, levels = names(data))
    # one density curve per column, drawn in the diagonal panels
    densities <- do.call("rbind", lapply(1:ncol(data), function(i) {
        data.frame(xvar = names(data)[i], yvar = names(data)[i], 
            x = data[, i])
    }))
    densities$xvar <- factor(densities$xvar, levels = names(data))
    densities$yvar <- factor(densities$yvar, levels = names(data))
    # defaults() comes from plyr, which reshape2 depends on
    mapping <- plyr::defaults(mapping, aes_string(x = "x", y = "y"))
    class(mapping) <- "uneval"
    ggplot(all, mapping) + 
        facet_grid(xvar ~ yvar, scales = "free") + 
        geom_point(colour = colour, na.rm = TRUE) + 
        stat_density(aes(x = x,y = ..scaled.. * diff(range(x)) + min(x)), 
            data = densities,position = "identity", colour = "grey20", geom = "line") + 
        geom_smooth(se = FALSE,method = "lm",colour = "blue")
}

# drop the id column so that only A to D are plotted
plotmatrix(dat[, -1])
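
The matrix above draws every pair twice (A vs B and B vs A). If only the unique pairs from the question are wanted, something along these lines might work. It is a rough sketch rather than part of the answer above: the names long, wide, pair_list and pair_data are made up for illustration, the reshape uses base R's reshape() instead of dcast, and the plotting step is skipped when ggplot2 is not installed.

```r
# Toy melted data in the same shape as the question (values are random).
set.seed(42)
long <- data.frame(
  id       = gl(4, 1, 16, labels = paste0("id", 1:4)),
  variable = gl(4, 4, labels = LETTERS[1:4]),
  value    = rnorm(16)
)

# Base-R long-to-wide reshape: one row per id, one column per variable level.
wide <- reshape(long, idvar = "id", timevar = "variable", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))  # value.A -> A, and so on

# Each unordered pair exactly once: A-B, A-C, A-D, B-C, B-D, C-D.
vars <- levels(long$variable)
pair_list <- combn(vars, 2, simplify = FALSE)

# Stack the pairs into one long data frame, labelled by pair.
pair_data <- do.call(rbind, lapply(pair_list, function(p) {
  data.frame(pair = paste(p[1], "vs", p[2]),
             id = wide$id, x = wide[[p[1]]], y = wide[[p[2]]])
}))

# One panel per pair, with a linear smoother on top of the points.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pair_data, aes(x = x, y = y)) +
    geom_point() +
    geom_smooth(method = "lm", se = FALSE) +
    facet_wrap(~ pair, scales = "free")
  # print(p) draws the six panels
}
```

With four variables this gives six panels instead of twelve, at the cost of losing the grid layout and the density curves on the diagonal.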



joran answered Jan 14 '23 23:01


Following @Dason's suggestion to try the GGally package and using @baptise's reshaping code...

    library(reshape2)  # dcast()
    library(plyr)      # .() used in dcast's subset argument
    library(GGally)    # ggpairs()

    n <- 100   # number of observations
    i <- 4     # number of variables, cannot exceed 26 since letters are used as labels
    # create data, following @Davy
    d <- data.frame(id = gl(n, 1, labels = paste("id", 1:n, sep = "")),
                    variable = gl(i, n, labels = LETTERS[1:i]), value = rnorm(n * i))
    # reshape for plotting, from @baptise
    group <- unique(d$variable)
    m <- dcast(d, ... ~ variable, subset = .(variable %in% group))
    # make scatterplot matrix using GGally package
    # as suggested by @Dason
    ggpairs(m[, -1],   # drop the id column
            lower = list(continuous = "smooth"))
    # done!

The result is a bit busy, with grid lines in the boxes above the diagonal (no doubt they can be turned off), and some other finishing touches are needed before this could go prime-time.


But it's generally true to the ggplot2 approach (the smoother can be removed, if required). The GGally code is available on GitHub.
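
As a hedged illustration of those two adjustments, the sketch below blanks the panels above the diagonal and swaps the smoother for plain points. It assumes a GGally version in which "blank" and "points" are accepted values for the continuous entry of upper and lower; the data frame m here is freshly simulated for the example, not the reshaped data from the code above.

```r
# Simulated wide data: one numeric column per variable (hypothetical values).
set.seed(1)
m <- data.frame(A = rnorm(50), B = rnorm(50), C = rnorm(50), D = rnorm(50))

if (requireNamespace("GGally", quietly = TRUE)) {
  pm <- GGally::ggpairs(
    m,
    lower = list(continuous = "points"),  # plain scatter, no smoother
    upper = list(continuous = "blank")    # nothing above the diagonal
  )
  # print(pm) draws the matrix
}
```

Setting lower back to list(continuous = "smooth") restores the smoother while keeping the upper triangle empty.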

It's also worth noting that there are examples (including code) of a fantastic variety of scatterplot matrices that can be done in R at Romain François' R Graph Gallery. This one is quite similar to the one above.

Ben answered Jan 14 '23 23:01
