
Different data in upper and lower panel of scatterplot matrix

Tags: dataframe, plot, r

I want to plot two different data sets in a scatterplot matrix.

I know that I can use upper.panel and lower.panel to supply different plot functions. However, I haven't managed to get my data into a suitable format to take advantage of this.

Assume I have two tissues (“brain” and “heart”) and four conditions (1–4). Now I can use e.g. pairs(data$heart) to get a scatterplot matrix for one of the data sets. Assume I have the following data:

conditions <- 1 : 4
noise <- rnorm(100)
data <- list(brain = sapply(conditions, function (x) noise + 0.1 * rnorm(100)),
             heart = sapply(conditions, function (x) noise + 0.3 * rnorm(100)))

How do I get this into a format so that pairs(data, …) plots one data set above and one below the diagonal, as shown here (green = brain, violet = heart):

[Screenshot of the desired scatterplot matrix]

Just using

pairs(data, upper.panel = something, lower.panel = somethingElse)

doesn’t work, because it plots all conditions against all conditions without regard for tissue; it essentially ignores the list structure. The same happens when reordering the hierarchy (i.e. having data = (A=list(brain=…, heart=…), B=list(brain=…, heart=…), …)).

Konrad Rudolph, asked Mar 25 '13 21:03


1 Answer

This is the best I can manage by passing arguments (dat is the combined data frame built in the second snippet further down):

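# Each panel function names every extra argument, including the ones it
# does not use, so that none of them ends up in `...`.
foo.upper <- function(x, y, ind.upper, col.upper, ind.lower, col.lower, ...) {
    points(x[ind.upper], y[ind.upper], col = col.upper, ...)
}

foo.lower <- function(x, y, ind.lower, col.lower, ind.upper, col.upper, ...) {
    points(x[ind.lower], y[ind.lower], col = col.lower, ...)
}

# Column 5 is the tissue label, so it is dropped from the plotted data.
pairs(dat[, -5],
      lower.panel = foo.lower,
      upper.panel = foo.upper,
      ind.upper = dat$type == 'brain',
      ind.lower = dat$type == 'heart',
      col.upper = 'blue',
      col.lower = 'red')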

Note that each panel function has to declare all of the extra arguments, because pairs passes everything in ... to both panels. ... is a cruel mistress. If you include only the panel-specific arguments in each function, it appears to work, but you get lots of warnings, because R passes the leftover arguments on to the regular plotting functions, which don't recognise them.
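To see why the unused arguments have to be named, here is a small sketch with no plotting involved. The two helper functions are hypothetical and exist only for illustration; each one reports which arguments were not matched by a formal parameter and therefore landed in ... instead.

```r
# Hypothetical helpers: return the names of the arguments that fell into `...`.
panel_all_named <- function(x, y, ind.upper, col.upper, ind.lower, col.lower, ...) {
  names(list(...))
}
panel_own_only <- function(x, y, ind.upper, col.upper, ...) {
  names(list(...))
}

# The same set of extra arguments is handed to both functions,
# just as pairs hands its extra arguments to both panels.
extra <- list(ind.upper = TRUE, col.upper = "blue",
              ind.lower = FALSE, col.lower = "red")

leftover_all <- do.call(panel_all_named, c(list(x = 1, y = 2), extra))
leftover_own <- do.call(panel_own_only, c(list(x = 1, y = 2), extra))

print(leftover_all)  # nothing is left over
print(leftover_own)  # the lower-panel arguments leak into `...`
```

In the second case the leaked names would be forwarded to points() in a real panel function, which is where the warnings come from.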

This was my quick first attempt, but it seems ugly:

dat <- as.data.frame(do.call(rbind,data))
dat$type <- rep(c('brain','heart'),each = 100)

foo.upper <- function(x,y,...){
    points(x[dat$type == 'brain'],y[dat$type == 'brain'],col = 'red',...)
}

foo.lower <- function(x,y,...){
    points(x[dat$type == 'heart'],y[dat$type == 'heart'],col = 'blue',...)
}

pairs(dat[,-5],lower.panel = foo.lower,upper.panel = foo.upper)


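I'm abusing R's scoping in this second version in a somewhat ugly way: the panel functions pick up dat from the enclosing environment instead of receiving it as an argument. (Of course, you could probably do this more cleanly in lattice, but you probably knew that.)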
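A possible middle ground, sketched here as an assumption rather than as part of the original answer, is a closure that captures the row selection and colour. That way nothing depends on a global dat and nothing has to travel through the extra arguments of pairs. The helper name make_panel is hypothetical, the seed is added only for reproducibility, and the colours follow the question (green = brain, violet = heart).

```r
# Rebuild the example data from the question (seed added for reproducibility).
set.seed(1)
conditions <- 1:4
noise <- rnorm(100)
data <- list(brain = sapply(conditions, function(x) noise + 0.1 * rnorm(100)),
             heart = sapply(conditions, function(x) noise + 0.3 * rnorm(100)))

# Stack both tissues into one data frame and label the rows.
dat <- as.data.frame(do.call(rbind, data))
dat$type <- rep(c("brain", "heart"), each = 100)

# Hypothetical helper: returns a panel function that remembers which rows
# to draw and in which colour.
make_panel <- function(keep, col) {
  force(keep)
  force(col)
  function(x, y, ...) points(x[keep], y[keep], col = col, ...)
}

upper <- make_panel(dat$type == "brain", "green")
lower <- make_panel(dat$type == "heart", "violet")

pairs(dat[, 1:4], upper.panel = upper, lower.panel = lower)
```

As in the versions above, pairs still computes the axis ranges from the combined columns, so both tissues share the same scales.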

The only other option I can think of is to design your own scatter plot matrix using layout, but that's probably quite a bit of work.
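For a rough idea of what the layout route involves, here is a minimal sketch. It assumes both data sets are matrices with the same number of columns, as in the question; the function name split_pairs and its arguments are hypothetical. It is far less polished than pairs: no axes are drawn and every panel gets its own axis range.

```r
# Example data from the question (seed added for reproducibility).
set.seed(1)
conditions <- 1:4
noise <- rnorm(100)
data <- list(brain = sapply(conditions, function(x) noise + 0.1 * rnorm(100)),
             heart = sapply(conditions, function(x) noise + 0.3 * rnorm(100)))

# Hypothetical helper: draws `upper` above the diagonal and `lower` below it.
# Returns (invisibly) the number of scatterplot panels drawn.
split_pairs <- function(upper, lower, col.upper = "green", col.lower = "violet") {
  stopifnot(ncol(upper) == ncol(lower))
  n <- ncol(upper)
  old_par <- par(mar = c(1, 1, 1, 1))
  on.exit({ par(old_par); layout(1) })
  # One cell per panel, filled row by row.
  layout(matrix(seq_len(n * n), nrow = n, byrow = TRUE))
  drawn <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i == j) {
        # Diagonal cell: just a label.
        plot.new()
        text(0.5, 0.5, paste("Condition", i))
      } else {
        src <- if (i < j) upper else lower
        colour <- if (i < j) col.upper else col.lower
        plot(src[, j], src[, i], col = colour, axes = FALSE, xlab = "", ylab = "")
        drawn <- drawn + 1
      }
      box()
    }
  }
  invisible(drawn)
}

panels <- split_pairs(data$brain, data$heart)
```

Because the two data sets are passed separately, this version does not require them to have the same number of rows.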

Lattice Edit

Here's at least a start on a lattice solution. It should handle varying x,y axis ranges better, but I haven't tested that.

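library(lattice)

# Stack both tissues and label the rows: 'a' = brain, 'b' = heart.
dat <- do.call(rbind, data)
dat <- as.data.frame(dat)
dat$grp <- rep(letters[1:2], each = 100)

# Below the diagonal: group 'a' only, in red.
plower <- function(x, y, grp, ...) {
    panel.xyplot(x[grp == 'a'], y[grp == 'a'], col = 'red', ...)
}

# Above the diagonal: group 'b' only, in the default colour.
pupper <- function(x, y, grp, ...) {
    panel.xyplot(x[grp == 'b'], y[grp == 'b'], ...)
}

splom(~dat[, 1:4],
      data = dat,
      lower.panel = plower,
      upper.panel = pupper,
      grp = dat$grp)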
joran, answered Sep 20 '22 10:09