
How can a data ellipse be superimposed on a ggplot2 scatterplot?

I have an R function which produces 95% confidence ellipses for scatterplots. The output looks like this, having a default of 50 points for each ellipse (50 rows):

             [,1]         [,2]
 [1,]  0.097733810  0.044957994
 [2,]  0.084433494  0.050337990
 [3,]  0.069746783  0.054891438

I would like to superimpose a number of such ellipses for each level of a factor called 'site' on a ggplot2 scatterplot, produced from this command:

> plat1 <- ggplot(mapping=aes(shape=site, size=geom), shape=factor(site))
> plat1 + geom_point(aes(x=PC1.1,y=PC2.1))

This is run on a dataset, called dflat which looks like this:

    site      geom         PC1.1        PC2.1      PC3.1        PC1.2      PC2.2
1 Buhlen 1259.5649 -0.0387975838 -0.022889782 0.01355317  0.008705276 0.02441577
2 Buhlen  653.6607 -0.0009398704 -0.013076251 0.02898955 -0.001345149 0.03133990

The result is fine, but when I try to add the ellipse (let's say for this one site, called "Buhlen"):

> plat1 + geom_point(aes(x=PC1.1,y=PC2.1)) + geom_path(data=subset(dflat, site="Buhlen"),mapping=aes(x=ELLI(PC1.1,PC2.1)[,1],y=ELLI(PC1.1,PC2.1)[,2])) 

I get an error message: "Error in data.frame(x = c(0.0977338099339815, 0.0844334944904515, 0.0697467834016782, : arguments imply differing number of rows: 50, 211"

I've managed to fix this in the past, but I cannot remember how. It seems that geom_path is relying on the same points rather than plotting new ones. Any help would be appreciated.

radu asked Mar 07 '10 17:03



1 Answer

Maybe this could help you:

# simulate two groups of points
set.seed(101)
n <- 1000
x <- rnorm(n, mean=2)
y <- 1.5 + 0.4*x + rnorm(n)
df <- data.frame(x=x, y=y, group="A")
x <- rnorm(n, mean=2)
y <- 1.5*x + 0.4 + rnorm(n)
df <- rbind(df, data.frame(x=x, y=y, group="B"))
# calculate one ellipse per group
library(ellipse)
df_ell <- data.frame()
# unique() works whether group is a factor or a character column
# (data.frame() no longer creates factors by default since R 4.0, so levels() would return NULL)
for(g in unique(df$group)){
  df_ell <- rbind(df_ell, cbind(as.data.frame(with(df[df$group==g,], ellipse(cor(x, y),
                                          scale=c(sd(x),sd(y)),
                                          centre=c(mean(x),mean(y))))),group=g))
}
# draw the points, then the ellipses from their own data frame
library(ggplot2)
p <- ggplot(data=df, aes(x=x, y=y,colour=group)) +
  geom_point(size=1.5, alpha=.6) +
  geom_path(data=df_ell, aes(x=x, y=y,colour=group), size=1, linetype=2)
p

Output looks like this:

[image: scatterplot of both groups with one dashed ellipse per group]
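As for the error in the question: it most likely arises because aesthetics given to geom_path() are evaluated against that layer's data, so each mapped vector needs one value per row. ELLI() returns 50 rows, while subset(dflat, site="Buhlen") apparently still holds all 211 rows (with = instead of ==, the condition is passed as an unused named argument, so nothing is filtered). Below is a minimal sketch of the same per-group approach on data shaped like dflat. The assumptions are: elli_points() is a hypothetical stand-in for ELLI(), which is not shown in the question; dflat_demo is simulated; "SiteB" is a made-up second site; and the plot is only drawn if ggplot2 is installed.

```r
# Hypothetical stand-in for the asker's ELLI(): n points on a normal-theory data ellipse.
elli_points <- function(x, y, level = 0.95, n = 50) {
  centre <- c(mean(x), mean(y))
  sigma  <- cov(cbind(x, y))
  radius <- sqrt(qchisq(level, df = 2))
  theta  <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  # chol(sigma) is an upper-triangular R with t(R) %*% R == sigma,
  # so circle %*% R maps the unit circle onto the covariance ellipse.
  pts <- sweep(radius * (circle %*% chol(sigma)), 2, centre, "+")
  data.frame(x = pts[, 1], y = pts[, 2])
}

# Simulated data with the same column names as dflat (values are made up).
set.seed(1)
dflat_demo <- data.frame(
  site  = rep(c("Buhlen", "SiteB"), each = 100),
  geom  = runif(200, 500, 1500),
  PC1.1 = rnorm(200, sd = 0.03),
  PC2.1 = rnorm(200, sd = 0.02)
)

# One ellipse per site, stacked into a single data frame with a site column.
ell <- do.call(rbind, lapply(split(dflat_demo, dflat_demo$site), function(d) {
  cbind(elli_points(d$PC1.1, d$PC2.1), site = d$site[1])
}))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dflat_demo, aes(x = PC1.1, y = PC2.1)) +
    geom_point(aes(shape = site, size = geom)) +
    # The ellipse layer gets its own data; inherit.aes = FALSE keeps it from
    # looking for columns that only exist in the scatterplot data.
    geom_path(data = ell, aes(x = x, y = y, group = site), inherit.aes = FALSE)
  print(p)
}
```

The key design choice is that the ellipse coordinates live in their own data frame, so the path layer never has to reconcile 50 ellipse rows with the rows of the original data.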

Here is a more complex example.
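As a possible shortcut, more recent ggplot2 releases include stat_ellipse(), which computes an ellipse per group inside the plot, so no separate ellipse data frame is needed. This is a different technique from the ellipse package used above, and the ellipses may differ slightly from the ones drawn there. A minimal sketch on the same kind of simulated data, assuming a ggplot2 version that provides stat_ellipse():

```r
# Simulated data in the same spirit as the example above.
set.seed(101)
n  <- 1000
xa <- rnorm(n, mean = 2)
xb <- rnorm(n, mean = 2)
df <- rbind(
  data.frame(x = xa, y = 1.5 + 0.4 * xa + rnorm(n), group = "A"),
  data.frame(x = xb, y = 1.5 * xb + 0.4 + rnorm(n), group = "B")
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y, colour = group)) +
    geom_point(size = 1.5, alpha = 0.6) +
    # type = "norm" assumes a bivariate normal distribution; the default is type = "t".
    stat_ellipse(type = "norm", level = 0.95, linetype = 2)
  print(p)
}
```

Because colour is mapped to group, one ellipse is drawn for each group automatically.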

Yuriy Petrovskiy answered Sep 19 '22 14:09