I want to retrieve the cumulative proportion of explained variance after a PCA in R. summary(pca)
returns this result in its last row, but how can I extract that row?
summary(prcomp(USArrests, scale = TRUE))
Importance of components:
PC1 PC2 PC3 PC4
Standard deviation 1.5749 0.9949 0.59713 0.41645
Proportion of Variance 0.6201 0.2474 0.08914 0.04336
Cumulative Proportion 0.6201 0.8675 0.95664 1.00000
I tried s <- summary(prcomp(USArrests, scale = TRUE))
and then s[3] etc., but that doesn't return the last row.
Cumulative Proportion: This is simply the accumulated amount of explained variance, i.e., the share of total variance accounted for by the first k components taken together. For instance, if the first 10 components account for >95% of the total variance in the data, the cumulative proportion at the 10th component is above 0.95.
Explained variance is calculated as the ratio of the eigenvalue of a particular principal component (eigenvector) to the sum of all eigenvalues. In Python, it is available as the explained_variance_ratio_ attribute of a PCA instance created with the sklearn.decomposition.PCA class.
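In R, the same eigenvalue ratio can be computed by hand from a prcomp object. The following is a minimal sketch using the USArrests example from the question; the variable names (pca, eigenvalues, prop_var, cum_prop) are only illustrative, and the argument is spelled scale. as in the prcomp documentation (the question's scale = TRUE works through partial matching).

```r
# PCA on the standardized USArrests data (same analysis as in the question)
pca <- prcomp(USArrests, scale. = TRUE)

# The eigenvalues are the squared standard deviations of the components
eigenvalues <- pca$sdev^2

# Proportion of variance: each eigenvalue divided by the sum of all eigenvalues
prop_var <- eigenvalues / sum(eigenvalues)

# Cumulative proportion: running total of the proportions
cum_prop <- cumsum(prop_var)
names(cum_prop) <- colnames(pca$rotation)

round(cum_prop, 5)
```

The rounded values should agree with the "Cumulative Proportion" row printed by summary(pca).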
The cumulative explained variance shows the accumulation of variance for each principal component number. The individual explained variance describes the variance of each principal component.
On the plotted chart, we can see how many principal components we need. In this case, to get 95% of the variance explained, I need 9 principal components.
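The same "how many components do I need" question can be answered without a chart. Below is a rough sketch in R for the USArrests data from the question (not the dataset behind the 9-component result above); the 0.95 threshold and the names threshold and n_comp are assumptions chosen for illustration.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# Cumulative proportion of explained variance per component
cum_prop <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Assumed target: explain at least 95% of the total variance
threshold <- 0.95

# Index of the first component at which the cumulative proportion
# reaches the threshold
n_comp <- which(cum_prop >= threshold)[1]
n_comp
# [1] 3
```

For USArrests the first two components reach about 0.87 and the first three about 0.96, so three components are needed at this threshold.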
Expanding on user20650's answer in the question's comments, as I believe it answers the question most directly (i.e. via the object itself, rather than recalculating). TL;DR: s$importance[3, ].
(s <- summary(prcomp(USArrests, scale = TRUE)))
# Importance of components:
# PC1 PC2 PC3 PC4
# Standard deviation 1.5749 0.9949 0.59713 0.41645
# Proportion of Variance 0.6201 0.2474 0.08914 0.04336
# Cumulative Proportion 0.6201 0.8675 0.95664 1.00000
str(s)
# List of 6
# $ sdev : num [1:4] 1.575 0.995 0.597 0.416
# $ rotation : num [1:4, 1:4] -0.536 -0.583 -0.278 -0.543 0.418 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : chr [1:4] "Murder" "Assault" "UrbanPop" "Rape"
# .. ..$ : chr [1:4] "PC1" "PC2" "PC3" "PC4"
# $ center : Named num [1:4] 7.79 170.76 65.54 21.23
# ..- attr(*, "names")= chr [1:4] "Murder" "Assault" "UrbanPop" "Rape"
# $ scale : Named num [1:4] 4.36 83.34 14.47 9.37
# ..- attr(*, "names")= chr [1:4] "Murder" "Assault" "UrbanPop" "Rape"
# $ x : num [1:50, 1:4] -0.976 -1.931 -1.745 0.14 -2.499 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : chr [1:50] "Alabama" "Alaska" "Arizona" "Arkansas" ...
# .. ..$ : chr [1:4] "PC1" "PC2" "PC3" "PC4"
# $ importance: num [1:3, 1:4] 1.575 0.62 0.62 0.995 0.247 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : chr [1:3] "Standard deviation" "Proportion of Variance" "Cumulative Proportion"
# .. ..$ : chr [1:4] "PC1" "PC2" "PC3" "PC4"
# - attr(*, "class")= chr "summary.prcomp"
# We see that importance is the relevant element of the list
s$importance
# PC1 PC2 PC3 PC4
# Standard deviation 1.574878 0.9948694 0.5971291 0.4164494
# Proportion of Variance 0.620060 0.2474400 0.0891400 0.0433600
# Cumulative Proportion 0.620060 0.8675000 0.9566400 1.0000000
# Cool, same as the displayed table. Grab the third row and voilà.
s$importance[3, ] # Numeric vector
# PC1 PC2 PC3 PC4
# 0.62006 0.86750 0.95664 1.00000
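As a small variation, the row can also be selected by its name instead of its position, which may be easier to read and does not depend on the row order. This is only a sketch; by_position and by_name are illustrative names.

```r
s <- summary(prcomp(USArrests, scale. = TRUE))

# Select the row by position (as above) and by row name
by_position <- s$importance[3, ]
by_name <- s$importance["Cumulative Proportion", ]

# Both approaches return the same named numeric vector
identical(by_position, by_name)
# [1] TRUE

# A single value can be picked out by row and column name
s$importance["Cumulative Proportion", "PC2"]
```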