
Similar xlims and ylims succinctly in ggplot2

Tags:

r

ggplot2

I'm using the following code to generate the biplot shown below.

library(ggfortify)
df <- iris[c(1, 2, 3, 4)]
autoplot(prcomp(df)) +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0)


How can I succinctly set the same x and y limits so that all four quadrants are exactly the same size?

Edited

library(ggfortify)
df <- iris[c(1, 2, 3, 4)]

autoplot(prcomp(df), data = iris, colour = 'Species',
         loadings = TRUE, loadings.colour = 'blue',
         loadings.label = TRUE, loadings.label.size = 3) +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) 

MYaseen208 asked Mar 19 '20 13:03


1 Answer

Note my comment about scaling your data before performing PCA. Biplots themselves can also be scaled in several ways.
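To illustrate the point about scaling, here is a minimal base-R sketch comparing PCA on the raw iris measurements with PCA on standardised variables. The object names pca_raw and pca_std are only illustrative and do not appear elsewhere in this answer.

```r
df <- iris[1:4]

# PCA on the raw measurements: variables with larger variance dominate
pca_raw <- prcomp(df)

# PCA on standardised variables (each column rescaled to sd 1)
pca_std <- prcomp(df, scale. = TRUE)

# Proportion of variance explained by each component
prop_raw <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
prop_std <- pca_std$sdev^2 / sum(pca_std$sdev^2)

round(rbind(raw = prop_raw, standardised = prop_std), 3)

# With standardised variables the total variance equals the number of columns
sum(pca_std$sdev^2)
```

Because the scores differ between the two fits, any axis limits derived from them differ as well, so the limits should be computed from the same prcomp object that is plotted.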

To your question: I think the easiest approach is to pull the maximum x/y coordinates of your individuals from the PCA object and use them as limits. This works when plotting the actual PCA values. For a scaled biplot, the limits depend on how you scale it; see below for one method.

Option 1 with the actual PCA values

library(ggplot2)
library(ggfortify)

df <- iris[1:4]

res.pca <- prcomp(df, scale. = TRUE)

cmax <- max(abs(res.pca$x[, 1:2])) # largest absolute coordinate of the individuals on PC1/PC2

autoplot(res.pca, data = iris, colour = 'Species',
         loadings = TRUE, loadings.colour = 'blue',
         loadings.label = TRUE, loadings.label.size = 3, 
         scale = FALSE) + # scale = FALSE!
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  coord_equal(xlim = c(-cmax,cmax), ylim = c(-cmax,cmax)) 

# also using coord_equal, so that it looks equal

Created on 2020-03-24 by the reprex package (v0.3.0)
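A symmetric limit has to cover the most extreme score in either direction, so a cautious variant takes the largest absolute value over the two plotted components rather than the largest signed value. The sketch below wraps this in a small helper; symmetric_limit and the toy matrix are hypothetical names and data introduced only for illustration.

```r
# Hypothetical helper: largest absolute score on the plotted components
symmetric_limit <- function(scores, choices = 1:2) {
  max(abs(scores[, choices, drop = FALSE]))
}

res.pca <- prcomp(iris[1:4], scale. = TRUE)
cmax <- symmetric_limit(res.pca$x)

# Every plotted score lies inside [-cmax, cmax] on both axes
all(abs(res.pca$x[, 1:2]) <= cmax)

# Toy case where the most extreme value is negative
toy <- cbind(PC1 = c(-5, 1, 2), PC2 = c(0.5, -1, 1))
toy[which.max(toy)]   # largest signed value: 2, which would clip the point at -5
symmetric_limit(toy)  # largest absolute value: 5
```

The resulting cmax can be passed to coord_equal(xlim = c(-cmax, cmax), ylim = c(-cmax, cmax)) exactly as in Option 1.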

Option 2 - One different way of scaling

This thread shows how (one way of) scaling is done under the hood.

From this, you can obtain the maximum limits for the scaled biplot.

library(ggplot2)
library(ggfortify)

df <- iris[1:4]

res.pca <- prcomp(df, scale. = TRUE)

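# Reproduce the biplot scaling by hand for the first two components
choices <- 1L:2L
scale <- 1
pc.biplot <- FALSE # not used below
scores <- res.pca$x
lam <- res.pca$sdev[choices]
n <- NROW(scores)
lam <- lam * sqrt(n)
lam <- lam^scale
bi_vec <- t(t(res.pca$rotation[, choices]) * lam) # scaled loadings (variables)
bi_ind <- t(t(scores[, choices]) / lam)           # scaled scores (individuals)

# largest absolute scaled score, so the limits are symmetric around 0
cmax <- max(abs(bi_ind))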

autoplot(res.pca, data = iris, colour = 'Species',
         loadings = TRUE, loadings.colour = 'blue',
         loadings.label = TRUE, loadings.label.size = 3) +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  coord_equal(xlim = c(-cmax,cmax), ylim = c(-cmax,cmax)) 

Created on 2020-03-24 by the reprex package (v0.3.0)
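As a sanity check on the scaling used in Option 2, the following base-R sketch (assuming scale = 1 and the first two components, as above) shows why the scaled limits are much smaller than the raw ones: dividing each component by sdev * sqrt(n) leaves every scaled component with a sum of squares of (n - 1) / n. It uses sweep() as an equivalent to the t(t(x) / lam) idiom.

```r
res.pca <- prcomp(iris[1:4], scale. = TRUE)

choices <- 1:2
n <- nrow(res.pca$x)
lam <- res.pca$sdev[choices] * sqrt(n)

# Scaled scores: divide each component (column) by its own lam
bi_ind <- sweep(res.pca$x[, choices], 2, lam, "/")

# Each scaled component has sum of squares (n - 1) / n
colSums(bi_ind^2)
(n - 1) / n

# Symmetric limits on the raw and on the scaled score axes
max(abs(res.pca$x[, choices]))
max(abs(bi_ind))
```

Since the scaled scores are bounded this way, a limit computed from the raw scores would leave the scaled biplot squeezed into the centre of the panel, which is why the limit has to be derived from the same scaling that the plot uses.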

tjebo answered Nov 19 '22 16:11