
How to change points and add a regression to a cloudplot (using R)?

To make clear what I'm asking I've created an easy example. Step one is to create some data:

gender <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),labels = c("male", "female"))
numberofdrugs <- rpois(84, 50) + 1
geneticvalue <- rpois(84,75)
death <- rpois(84, 50) + 15
y <- data.frame(death, numberofdrugs, geneticvalue, gender)

These are random data combined into one data.frame. From these data I'd like to plot a cloud in which I can distinguish males from females, and to which I can add two simple regressions (one for females and one for males). I've started, but I couldn't get to the point where I want to be. Here is what I've done so far:

library(lattice)
cloud(death ~ numberofdrugs * geneticvalue, data = y)

[Image: cloud plot in basic form]

xmale <- subset(y, gender=="male")
xfemale <- subset(y, gender=="female")

death.lm.male <- lm(death~numberofdrugs+geneticvalue, data=xmale)
death.lm.female <- lm(death~numberofdrugs+geneticvalue, data=xfemale)

How can I make different points for males or females when using the cloud command (for example blue and pink points instead of just blue crosses) and how can I add the two estimated models to the cloud graph?

Any thought is appreciated! Thanks for your ideas!

MarkDollar asked Jul 21 '11 10:07


1 Answer

Answer to the first half of your question, "How can I make different points for males or females when using the cloud command (for example blue and pink points instead of just blue crosses)?"

 cloud( death ~ numberofdrugs*geneticvalue , groups=gender, data=y )

[Image: grouped cloud plot]
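To get the specific blue and pink points mentioned in the question, one option is to hand lattice a small theme so that the points and the legend use the same colors. This is a minimal sketch, not the only way to do it: the data frame name `dat` and the `set.seed()` call are additions for reproducibility, and the data are simulated the same way as in the question.

```r
library(lattice)

# Simulated data as in the question; `dat` and set.seed() are additions for this sketch
set.seed(42)
gender <- factor(rep(c("male", "female"), c(43, 41)), levels = c("male", "female"))
dat <- data.frame(
  death         = rpois(84, 50) + 15,
  numberofdrugs = rpois(84, 50) + 1,
  geneticvalue  = rpois(84, 75),
  gender        = gender
)

# simpleTheme() sets the per-group color and symbol, so the key drawn by
# auto.key uses the same blue/pink filled circles as the points
p <- cloud(death ~ numberofdrugs * geneticvalue, data = dat, groups = gender,
           par.settings = simpleTheme(col = c("blue", "pink"), pch = 16),
           auto.key = TRUE)
print(p)
```

The colors follow the order of the factor levels, so "male" is blue and "female" is pink here.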

The meta-answer to this may involve some non-3d visualization. Perhaps you can use lattice or ggplot2 to split the data into small multiples? It will likely be more comprehensible and likely easier to add the regression results.

splom( ~ data.frame( death, numberofdrugs, geneticvalue ), groups=gender, data=y )

[Image: splom]

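splom lays out the matrix with the superpanel function panel.pairs, which calls the panel function panel.splom for each pair of variables, and you could likely supply a modified panel function to add a regression line without an enormous amount of trouble.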
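A rough sketch of such a panel function follows. It relies on `panel.xyplot()` accepting `type = c("p", "r")`, which draws the points plus a least-squares line, and does so per group when `groups` is supplied. The data frame name `dat` and the `set.seed()` call are additions for this example.

```r
library(lattice)

# Simulated data as in the question; `dat` and set.seed() are additions for this sketch
set.seed(42)
gender <- factor(rep(c("male", "female"), c(43, 41)), levels = c("male", "female"))
dat <- data.frame(
  death         = rpois(84, 50) + 15,
  numberofdrugs = rpois(84, 50) + 1,
  geneticvalue  = rpois(84, 75),
  gender        = gender
)

p <- splom(~ dat[, c("death", "numberofdrugs", "geneticvalue")],
           groups = gender, data = dat,
           panel = function(x, y, ...) {
             # "p" draws the points, "r" adds a simple linear regression line;
             # because groups is passed through ..., this happens once per gender
             panel.xyplot(x, y, type = c("p", "r"), ...)
           },
           auto.key = TRUE)
print(p)
```

Each off-diagonal panel then shows one fitted line per gender, which is a pairwise (one predictor at a time) view rather than the two-predictor plane fitted by `lm(death ~ numberofdrugs + geneticvalue)`.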

ggplot2 does regressions within the plot matrix easily, but I can't get the colors to work.

library(ggplot2)  # plotmatrix() is only available in older ggplot2 releases
pm <- plotmatrix( y[ , 1:3], mapping = aes(color=death) )
pm + geom_smooth(method="lm")

[Image: plotmatrix]
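With a current ggplot2, where `plotmatrix()` is no longer available, `ggpairs()` from the GGally package covers similar ground and can color by gender. The following is a sketch under the assumption that GGally is installed; the data frame name `dat` and the `set.seed()` call are additions for this example.

```r
library(ggplot2)
library(GGally)

# Simulated data as in the question; `dat` and set.seed() are additions for this sketch
set.seed(42)
gender <- factor(rep(c("male", "female"), c(43, 41)), levels = c("male", "female"))
dat <- data.frame(
  death         = rpois(84, 50) + 15,
  numberofdrugs = rpois(84, 50) + 1,
  geneticvalue  = rpois(84, 75),
  gender        = gender
)

# columns = 1:3 restricts the matrix to the three numeric variables;
# the lower triangle gets scatterplots with a linear fit per gender
pm <- ggpairs(dat, columns = 1:3,
              mapping = aes(colour = gender),
              lower = list(continuous = wrap("smooth", method = "lm")))
print(pm)
```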

And finally, if you really want to do a cloudplot with a regression plane, here's a way to do it using the scatterplot3d package. Note I changed the data to have a little more interesting structure to see:

numberofdrugs <- rpois( 84, 50 ) + 1
geneticvalue <- numberofdrugs + rpois( 84, 75 )
death <- geneticvalue + rpois( 84, 50 ) + 15
y <- data.frame( death, numberofdrugs, geneticvalue, gender )

library(scatterplot3d)
pts <- as.numeric( y$gender ) + 4  # pch 5 for male, pch 6 for female
# The response (death) goes on the vertical z axis so that the fitted plane matches the axes
s <- scatterplot3d( y$numberofdrugs, y$geneticvalue, y$death, pch=pts, type="p", highlight.3d=TRUE )
fit <- lm( death ~ numberofdrugs + geneticvalue, data=y )
s$plane3d(fit)

[Image: scatterplot3d with regression plane]
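The question asks for one regression per gender, so the same idea can be extended by fitting the model separately for males and females and drawing one plane for each. This is a sketch rather than a definitive solution: the data frame name `dat`, the `set.seed()` call, and the color choices are additions, and it assumes the predictors go on the x and y axes with `death` on the vertical z axis.

```r
library(scatterplot3d)

# Simulated data with some structure, as above; `dat` and set.seed() are additions
set.seed(42)
gender <- factor(rep(c("male", "female"), c(43, 41)), levels = c("male", "female"))
numberofdrugs <- rpois(84, 50) + 1
geneticvalue  <- numberofdrugs + rpois(84, 75)
death         <- geneticvalue + rpois(84, 50) + 15
dat <- data.frame(death, numberofdrugs, geneticvalue, gender)

# One linear model per gender
fits <- lapply(split(dat, dat$gender),
               function(d) lm(death ~ numberofdrugs + geneticvalue, data = d))

# Blue points for males, pink points for females
cols <- c(male = "blue", female = "pink")
s <- scatterplot3d(dat$numberofdrugs, dat$geneticvalue, dat$death,
                   color = unname(cols[as.character(dat$gender)]), pch = 16,
                   xlab = "numberofdrugs", ylab = "geneticvalue", zlab = "death")

# plane3d() accepts an lm object of the form z ~ x + y; col is passed on to the grid lines
s$plane3d(fits$male, col = "blue")
s$plane3d(fits$female, col = "pink")
```

Because the simulated data do not actually differ by gender, the two planes will nearly coincide here; with real data they would separate wherever the fitted coefficients differ.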

Ari B. Friedman answered Sep 21 '22 17:09