
ggplot - geom_segment() with multiple arrows

Tags: r, ggplot2

I am working on Principal Component Analysis (PCA). I found ggfortify works great but would like to do some manual adjustments.

I am trying to plot the PCA results as follows:

evec <- read.table(textConnection("
  PC1        PC2        PC3
  -0.5708394 -0.6158420 -0.5430295
  -0.6210178 -0.1087985  0.7762086
  -0.5371026  0.7803214 -0.3203424"
), header = TRUE, row.names = c("M1", "M2", "M3"))

res.ct <- read.table(textConnection("
  PC1        PC2        PC3
  -1.762697 -1.3404825 -0.3098503
  -2.349978 -0.0531175  0.6890453
  -1.074205  1.5606429 -0.6406848
  2.887080 -0.7272039 -0.3687029
  2.299799  0.5601610  0.6301927"
), header = TRUE, row.names = c("A", "B", "C", "D", "E"))

library(ggplot2)
library(dplyr)  # provides the %>% pipe used below
gpobj <- 
  res.ct %>%
  ggplot(mapping = aes(x=PC1, y=PC2)) +
  geom_point(color="grey30") +
  annotate(geom="text", x=res.ct$PC1*1.07, y=res.ct$PC2*1.07,
           label=rownames(res.ct))

for (i in 1:nrow(evec))
{
  PCx <- evec[i,1]
  PCy <- evec[i,2]
  axisname <- rownames(evec)[[i]]
  gpobj <- gpobj +
    geom_segment(
      data = evec[i,],
      aes(
        x = 0, y = 0,
        xend = PC1, yend = PC2
        # xend = PCx, yend = PCy  #not work as intended
      ),
      arrow = arrow(length = unit(4, "mm")),
      color = "red"
    ) +
    annotate(
      geom = "text",
      x = PCx * 1.15, y = PCy * 1.15,
      label = axisname,
      color = "red"
    )
}
gpobj

The code works as written. However, when I use the commented-out line xend = PCx, yend = PCy instead of xend = PC1, yend = PC2, it no longer works as intended: not all of the arrows are shown.

xend = PC1, yend = PC2 works well (figure: all arrows are drawn).

xend = PCx, yend = PCy does not (figure: arrows are missing).

Question: Why does geom_segment() not keep the previously added arrows when the start and end points are given by variables in the environment rather than by column names from data =?

Nobutag asked Dec 14 '19 00:12


1 Answer

In the code you used, when PCx / PCy are specified inside the aesthetic mapping aes(...) (as opposed to hard coding them to fixed aesthetic values outside aes(...), as done for the annotate layers), the actual values are only evaluated when you plot / print the ggplot object gpobj.

This means the values of PCx / PCy are looked up only after the for-loop has finished. By that point they hold the last values they were assigned, those for i = 3, which is why only one arrow segment is visible (actually three identical arrows drawn on top of one another). Moving xend = PCx, yend = PCy outside aes(...) should achieve the look you want.
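To make the difference concrete, here is a minimal sketch that reuses the first two columns of evec from the question and builds the loop in three ways. The plot names p_lazy, p_fixed and p_bang and the helper layer_xend() are made up for this illustration. The third variant, injecting the value with !! inside aes(...), is not part of the answer above; it is an alternative that relies on aes() supporting quasiquotation.

```r
library(ggplot2)

# Loadings from the question (first two components only)
evec <- data.frame(
  PC1 = c(-0.5708394, -0.6210178, -0.5371026),
  PC2 = c(-0.6158420, -0.1087985,  0.7803214),
  row.names = c("M1", "M2", "M3")
)

p_lazy  <- ggplot()  # PCx/PCy inside aes(): looked up when the plot is built
p_fixed <- ggplot()  # PCx/PCy outside aes(): values stored at each iteration
p_bang  <- ggplot()  # PCx/PCy injected with !!: values spliced into the mapping

for (i in seq_len(nrow(evec))) {
  PCx <- evec[i, 1]
  PCy <- evec[i, 2]

  p_lazy <- p_lazy +
    geom_segment(data = evec[i, ],
                 aes(x = 0, y = 0, xend = PCx, yend = PCy))

  p_fixed <- p_fixed +
    geom_segment(data = evec[i, ],
                 aes(x = 0, y = 0),
                 xend = PCx, yend = PCy)

  p_bang <- p_bang +
    geom_segment(data = evec[i, ],
                 aes(x = 0, y = 0, xend = !!PCx, yend = !!PCy))
}

# Hypothetical helper: the xend value each layer ends up with after building
layer_xend <- function(p, n_layers) {
  vapply(seq_len(n_layers), function(k) layer_data(p, k)$xend, numeric(1))
}

n <- nrow(evec)
xend_lazy  <- layer_xend(p_lazy, n)   # every layer sees the final PCx
xend_fixed <- layer_xend(p_fixed, n)  # one distinct value per layer
xend_bang  <- layer_xend(p_bang, n)   # one distinct value per layer
```

Comparing xend_lazy with xend_fixed should show the three lazy layers sharing the value from the last iteration, while the other two keep one value per row of evec. One caveat if you print these stripped-down plots: values passed outside aes(...) do not, as far as I know, take part in scale training, so the axis limits have to come from another layer (such as the points in the question) or be set explicitly.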

I do wonder why you choose to use for-loops in the first place, though. Wouldn't something like the following serve the same purpose?

# convert row names to explicit columns
res.ct <- tibble::rownames_to_column(res.ct)
evec <- tibble::rownames_to_column(evec)

# plot
res.ct %>%
  ggplot(mapping = aes(x=PC1, y=PC2)) +
  geom_point(color="grey30") +
  geom_text(aes(x = PC1 * 1.07, y = PC2 * 1.07,
                label = rowname)) +
  geom_segment(data = evec,
               aes(x = 0, y = 0, xend = PC1, yend = PC2, group = rowname),
               arrow = arrow(length = unit(4, "mm")),
               color = "red") +
  geom_text(data = evec,
            aes(x = PC1 * 1.15, y = PC2 * 1.15, label = rowname),
            colour = "red")


Z.Lin answered Nov 15 '22 04:11