I am working on Principal Component Analysis (PCA). I found that ggfortify works great, but I would like to make some manual adjustments.
So I am trying to plot the PCA results myself, as below:
evec <- read.table(textConnection("
PC1 PC2 PC3
-0.5708394 -0.6158420 -0.5430295
-0.6210178 -0.1087985 0.7762086
-0.5371026 0.7803214 -0.3203424"
), header = TRUE, row.names = c("M1", "M2", "M3"))
res.ct <- read.table(textConnection("
PC1 PC2 PC3
-1.762697 -1.3404825 -0.3098503
-2.349978 -0.0531175 0.6890453
-1.074205 1.5606429 -0.6406848
2.887080 -0.7272039 -0.3687029
2.299799 0.5601610 0.6301927"
), header = TRUE, row.names = c("A", "B", "C", "D", "E"))
library(ggplot2)
library(dplyr)  # provides the %>% pipe used below
gpobj <-
  res.ct %>%
  ggplot(mapping = aes(x=PC1, y=PC2)) +
  geom_point(color="grey30") +
  annotate(geom="text", x=res.ct$PC1*1.07, y=res.ct$PC2*1.07,
           label=rownames(res.ct))
for (i in 1:nrow(evec))
{
  PCx <- evec[i,1]
  PCy <- evec[i,2]
  axisname <- rownames(evec)[[i]]
  gpobj <- gpobj +
    geom_segment(
      data = evec[i,],
      aes(
        x = 0, y = 0,
        xend = PC1, yend = PC2
        # xend = PCx, yend = PCy #not work as intended
      ),
      arrow = arrow(length = unit(4, "mm")),
      color = "red"
    ) +
    annotate(
      geom = "text",
      x = PCx * 1.15, y = PCy * 1.15,
      label = axisname,
      color = "red"
    )
}
gpobj
The code works well, but when I use the commented line xend = PCx, yend = PCy instead of xend = PC1, yend = PC2, it does not work as I intended: it does not show all the arrows.
xend = PC1, yend = PC2 works well; xend = PCx, yend = PCy does not.
Question:
Why does geom_segment() not keep the previous arrows when the start and end points are specified by environment variables rather than by column names from data = ?
In your code, PCx / PCy are specified inside the aesthetic mapping aes(...) (as opposed to being hard-coded as fixed aesthetic values outside aes(...), as is done for the annotate layers), so their actual values are only evaluated when you plot / print the ggplot object gpobj.
This means PCx / PCy are evaluated after the for-loop has finished. By that point they hold the last values they took on, for i = 3, which is why only one arrow segment (actually three arrows drawn on top of one another) is visible. Moving xend = PCx, yend = PCy outside aes(...) should achieve the look you want.
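To make the difference concrete, here is a minimal sketch that builds the arrows both ways and inspects what each layer actually draws. It assumes only ggplot2 is installed; the names p_lazy, p_fixed, lazy_xend and fixed_xend are made up for this illustration, and evec is rebuilt with data.frame() using the PC1 / PC2 values from the question.

```r
library(ggplot2)

# Loadings from the question (PC1 and PC2 only)
evec <- data.frame(
  PC1 = c(-0.5708394, -0.6210178, -0.5371026),
  PC2 = c(-0.6158420, -0.1087985,  0.7803214),
  row.names = c("M1", "M2", "M3")
)

p_lazy  <- ggplot()  # xend / yend mapped inside aes()
p_fixed <- ggplot()  # xend / yend set outside aes()

for (i in seq_len(nrow(evec))) {
  PCx <- evec[i, 1]
  PCy <- evec[i, 2]

  # Inside aes(): PCx / PCy are only looked up when the plot is built,
  # i.e. after the loop, when they hold the values for the last row.
  p_lazy <- p_lazy +
    geom_segment(data = evec[i, ],
                 aes(x = 0, y = 0, xend = PCx, yend = PCy))

  # Outside aes(): the current values are stored in the layer immediately.
  p_fixed <- p_fixed +
    geom_segment(data = evec[i, ],
                 aes(x = 0, y = 0),
                 xend = PCx, yend = PCy)
}

# layer_data() returns the data each layer ends up drawing
lazy_xend  <- sapply(1:3, function(k) layer_data(p_lazy,  k)$xend)
fixed_xend <- sapply(1:3, function(k) layer_data(p_fixed, k)$xend)

print(lazy_xend)   # the same end point for all three layers
print(fixed_xend)  # one distinct end point per layer
```

Comparing lazy_xend with fixed_xend shows the three overlaid arrows in the first plot and three separate arrows in the second.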
I do wonder why you chose to use a for-loop in the first place, though. Wouldn't something like the following serve the same purpose?
# convert row names to explicit columns
res.ct <- tibble::rownames_to_column(res.ct)
evec <- tibble::rownames_to_column(evec)
# plot
res.ct %>%
  ggplot(mapping = aes(x=PC1, y=PC2)) +
  geom_point(color="grey30") +
  geom_text(aes(x = PC1 * 1.07, y = PC2 * 1.07,
                label = rowname)) +
  geom_segment(data = evec,
               aes(x = 0, y = 0, xend = PC1, yend = PC2, group = rowname),
               arrow = arrow(length = unit(4, "mm")),
               color = "red") +
  geom_text(data = evec,
            aes(x = PC1 * 1.15, y = PC2 * 1.15, label = rowname),
            colour = "red")