I wonder if it is possible to plot PCA biplot results with ggplot2. Suppose I want to display the following biplot with ggplot2:

    fit <- princomp(USArrests, cor=TRUE)
    summary(fit)
    biplot(fit)
Any help will be highly appreciated. Thanks
In summary: A PCA biplot shows both PC scores of samples (dots) and loadings of variables (vectors). The further away these vectors are from a PC origin, the more influence they have on that PC.
A biplot is constructed by using the singular value decomposition (SVD) to obtain a low-rank approximation to a transformed version of the data matrix X, whose n rows are the samples (also called the cases, or objects), and whose p columns are the variables.
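In symbols, the construction can be sketched as follows. This assumes the transformation is column centering (and optionally scaling), giving a matrix written here as $X_c$; the names $G$, $H$ and the exponent $\alpha$ are notation introduced for this sketch, not taken from the text above.

```latex
\[
X_c = U D V^{\top}, \qquad
X_c \approx U_2 D_2 V_2^{\top} = G H^{\top}, \qquad
G = U_2 D_2^{\alpha}, \quad H = V_2 D_2^{1-\alpha}, \quad 0 \le \alpha \le 1 .
\]
```

Here $U_2$ and $V_2$ hold the first two columns of $U$ and $V$, and $D_2$ the two largest singular values. The $n$ rows of $G$ are drawn as points (samples) and the $p$ rows of $H$ as arrows (variables). With $\alpha = 1$, $G = U_2 D_2$ contains the PC scores and $H = V_2$ the loadings.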
Maybe this will help; it's adapted from code I wrote some time back. It now draws arrows as well.
    library(ggplot2)  # also re-exports arrow() and unit() from grid

    PCbiplot <- function(PC, x = "PC1", y = "PC2") {
      # PC is a prcomp object; x and y name the components to plot
      data <- data.frame(obsnames = row.names(PC$x), PC$x)
      plot <- ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
        geom_text(alpha = .4, size = 3, aes(label = obsnames)) +
        # reference lines through the origin (use size= instead of linewidth= in ggplot2 < 3.4)
        geom_hline(yintercept = 0, linewidth = .2) +
        geom_vline(xintercept = 0, linewidth = .2) +
        labs(x = x, y = y)
      datapc <- data.frame(varnames = rownames(PC$rotation), PC$rotation)
      # scale the loadings so the arrows fit inside the range of the scores
      mult <- min(
        (max(data[, y]) - min(data[, y])) / (max(datapc[, y]) - min(datapc[, y])),
        (max(data[, x]) - min(data[, x])) / (max(datapc[, x]) - min(datapc[, x]))
      )
      datapc$v1 <- .7 * mult * datapc[[x]]
      datapc$v2 <- .7 * mult * datapc[[y]]
      plot <- plot + coord_equal() +
        geom_text(data = datapc, aes(x = v1, y = v2, label = varnames),
                  size = 5, vjust = 1, color = "red") +
        geom_segment(data = datapc, aes(x = 0, y = 0, xend = v1, yend = v2),
                     arrow = arrow(length = unit(0.2, "cm")),
                     alpha = 0.75, color = "red")
      plot
    }

    fit <- prcomp(USArrests, scale. = TRUE)
    PCbiplot(fit)
You may want to change the size of the text, as well as the transparency and colors, to taste; it would be easy to make them parameters of the function. Note: this works with prcomp, but your example uses princomp, so you may need to adapt the code accordingly. Note 2: the code for geom_segment() is borrowed from the mailing list post linked from a comment on the question.
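One possible way to do that adaptation is sketched below. It assumes PCbiplot() only reads the `$x` and `$rotation` fields of its argument; `as_prcomp_like` is a hypothetical helper name introduced here, not part of any package. It copies the scores and loadings of a princomp fit into those fields and renames the columns from `Comp.1`, `Comp.2`, ... to `PC1`, `PC2`, ...

```r
# Hypothetical helper (an assumption for illustration): repackage a princomp
# fit so that it exposes the two fields PCbiplot() reads, $x and $rotation.
as_prcomp_like <- function(fit) {
  stopifnot(inherits(fit, "princomp"))
  scores   <- fit$scores             # observation scores, columns Comp.1, Comp.2, ...
  loadings <- unclass(fit$loadings)  # drop the "loadings" class to get a plain matrix
  pc_names <- paste0("PC", seq_len(ncol(scores)))
  colnames(scores)   <- pc_names
  colnames(loadings) <- pc_names
  list(x = scores, rotation = loadings, sdev = fit$sdev)
}

fit_pc  <- princomp(USArrests, cor = TRUE)
adapted <- as_prcomp_like(fit_pc)
str(adapted$rotation)
```

With this, `PCbiplot(adapted)` should draw the princomp result. The scores can differ slightly in scale (princomp uses divisor n rather than n - 1) and possibly in sign from those of prcomp, so the two plots need not be identical.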
Here is the simplest way, using ggbiplot:
    library(ggbiplot)
    fit <- princomp(USArrests, cor=TRUE)
    biplot(fit)
    ggbiplot(fit, labels = rownames(USArrests))