
Is there a way to create a "star" plot using ggplot?

Tags: r, ggplot2

I'm trying to (partially) reproduce the cluster plot available through s.class(...) in package ade4 using ggplot, but this question is actually much more general.

NB: This question refers to "star plots", but really only discusses spider plots.

df     <- mtcars[, c(1, 3, 4, 5, 6, 7)]
pca    <- prcomp(df, scale. = TRUE, retx = TRUE)
scores <- data.frame(pca$x)

library(ade4)
km <- kmeans(df,centers=3)
plot.df <- cbind(scores$PC1, scores$PC2)
s.class(plot.df, factor(km$cluster))

The essential feature I'm looking for is the "stars", i.e. a set of lines radiating from a common point (here, the cluster centroids) to a number of other points (here, the points in the cluster).

Is there a way to do that using the ggplot package? If not directly through ggplot, does anyone know of an add-in that works? For example, there are several variations on stat_ellipse(...) available, even though it is not part of the ggplot package.

jlhoward asked Dec 17 '13 00:12


2 Answers

This answer is based on @agstudy's response and the suggestions made in @Henrik's comment. Posting because it's shorter and more directly applicable to the question.

Bottom line is this: star plots are readily made with ggplot using geom_segment(...). Using df, pca, scores, and km from the question:

library(ggplot2)
# build ggplot dataframe with points (x,y) and corresponding groups (cluster)
gg <- data.frame(cluster=factor(km$cluster), x=scores$PC1, y=scores$PC2)
# calculate group centroid locations
centroids <- aggregate(cbind(x,y)~cluster,data=gg,mean)
# merge centroid locations into ggplot dataframe
gg <- merge(gg,centroids,by="cluster",suffixes=c("",".centroid"))
# generate star plot...
ggplot(gg) +
  geom_point(aes(x=x,y=y,color=cluster), size=3) +
  geom_point(data=centroids, aes(x=x, y=y, color=cluster), size=4) +
  geom_segment(aes(x=x.centroid, y=y.centroid, xend=x, yend=y, color=cluster))

The stars match those drawn by s.class(...); by default s.class(...) additionally draws an ellipse and a label for each cluster.
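
To get closer to the s.class(...) output, an ellipse per cluster can be layered on top of the stars. The following is only a sketch: the helper name star_data is hypothetical (it just wraps the aggregate/merge steps above), the seed is fixed because kmeans() uses random starts, and stat_ellipse(), which ships with ggplot2 in current releases, draws a confidence ellipse, so it should not be expected to match the ade4 ellipses exactly.

```r
# star_data() is a hypothetical helper, not part of ggplot2 or ade4.
# It returns one row per point with its cluster centroid attached.
star_data <- function(x, y, cluster) {
  pts <- data.frame(cluster = factor(cluster), x = x, y = y)
  centroids <- aggregate(cbind(x, y) ~ cluster, data = pts, FUN = mean)
  merge(pts, centroids, by = "cluster", suffixes = c("", ".centroid"))
}

# Rebuild the inputs from the question (seed fixed: kmeans() is random).
set.seed(1)
df     <- mtcars[, c(1, 3, 4, 5, 6, 7)]
scores <- data.frame(prcomp(df, scale. = TRUE)$x)
km     <- kmeans(df, centers = 3)
gg     <- star_data(scores$PC1, scores$PC2, km$cluster)

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(gg, aes(x = x, y = y, color = cluster)) +
    geom_segment(aes(x = x.centroid, y = y.centroid, xend = x, yend = y)) +
    geom_point(size = 3) +
    stat_ellipse()  # default is a 95% ellipse based on a multivariate t-distribution
  print(p)
}
```

If a cluster has very few points, stat_ellipse() may skip its ellipse with a warning; the stars are unaffected.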

jlhoward answered Nov 07 '22 23:11


The difficulty here is creating the data, not the plot itself. You should go through the code of the package and extract what is useful to you. This should be a good start:

dfxy <- plot.df
df <- data.frame(dfxy)
x <- df[, 1]
y <- df[, 2]

fac <- factor(km$cluster)
# f1: build a one-hot indicator matrix (rows = points, columns = clusters)
f1 <- function(cl) {
  n <- length(cl)
  cl <- as.factor(cl)
  x <- matrix(0, n, length(levels(cl)))
  x[(1:n) + n * (unclass(cl) - 1)] <- 1
  dimnames(x) <- list(names(cl), levels(cl))
  data.frame(x)
}
# unit weights, normalised so that each cluster's column sums to 1
wt <- rep(1, length(fac))
dfdistri <- f1(fac) * wt
w1 <- unlist(lapply(dfdistri, sum))
dfdistri <- t(t(dfdistri)/w1)

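## create a data.frame of segments (centroid -> point), one block per cluster
## cstar scales the segment length: 1 (the s.class default) ends each segment
## at its point, 2 extends it twice as far
cstar <- 2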
ll <- lapply(seq_len(ncol(dfdistri)),function(i){
  z1 <- dfdistri[,i]
  z <- z1[z1>0]
  x <- x[z1>0]
  y <- y[z1>0]
  z <- z/sum(z)
  x1 <- sum(x * z)
  y1 <- sum(y * z)
  hx <- cstar * (x - x1)
  hy <- cstar * (y - y1)
  data.frame(x=x1, y=y1, xend=x1 + hx, yend=y1 + hy, center=factor(i))
})

dat <- do.call(rbind,ll)
library(ggplot2)
ggplot(dat,aes(x=x,y=y))+
  geom_point(aes(shape=center)) +
  geom_segment(aes(yend=yend,xend=xend,color=center,group=center))
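
With unit weights, as used here, the indicator-matrix bookkeeping reduces to per-cluster means. A minimal base R sketch of that shortcut, assuming every point has weight 1; the function name star_segments and the toy data are hypothetical, and cstar plays the same role as above.

```r
# star_segments() is a hypothetical helper: it builds the same columns as `dat`
# above (x, y, xend, yend, center), assuming every point has weight 1.
# Rows stay in input order rather than being grouped by cluster.
star_segments <- function(x, y, fac, cstar = 1) {
  fac <- as.factor(fac)
  cx <- ave(x, fac)  # centroid x for each point's cluster (group mean)
  cy <- ave(y, fac)  # centroid y for each point's cluster
  data.frame(x = cx, y = cy,
             xend = cx + cstar * (x - cx),
             yend = cy + cstar * (y - cy),
             center = fac)
}

# Toy data: two clusters of three points each.
px  <- c(0, 1, 2, 10, 11, 12)
py  <- c(0, 3, 0, 5, 8, 5)
grp <- c(1, 1, 1, 2, 2, 2)
seg <- star_segments(px, py, grp)
print(seg)
```

The resulting data frame can be passed to geom_segment(aes(x=x, y=y, xend=xend, yend=yend, color=center)) in the same way as dat.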

agstudy answered Nov 08 '22 00:11