
Principal Components Analysis - how to get the contribution (%) of each parameter to a Prin.Comp.?

I want to know to what degree a measurement/parameter contributes to one of the calculated principal components.

A real-world description:

  1. I have five climatic parameters describing the geographic distribution of a species
  2. I performed a PCA with these five parameters
  3. The plot of PC1 vs. PC2 shows an interesting pattern

Question: How do I get the percentage of contribution (of each parameter) to each PC?

What I expect: PC1 consists of 30% parameter1, 50% parameter2, 20% parameter3, 0% parameter4 and 0% parameter5. PC2 consists of...

An example with 5 dummy-parameters:

    a <- rnorm(10, 50, 20)
    b <- seq(10, 100, 10)
    c <- seq(88, 10, -8)
    d <- rep(seq(3, 16, 3), 2)
    e <- rnorm(10, 61, 27)

    my_table <- data.frame(a, b, c, d, e)

    pca <- princomp(my_table, cor = TRUE)

    biplot(pca) # same: plot(pca$scores[,1], pca$scores[,2])

    pca
    summary(pca)

Where is my information hidden?

Chrugel asked Oct 06 '12 13:10



1 Answer

You want the $loadings component of the returned object:

    R> class(pca$loadings)
    [1] "loadings"
    R> pca$loadings

    Loadings:
      Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
    a -0.198  0.713        -0.671
    b  0.600         0.334 -0.170  0.707
    c -0.600        -0.334  0.170  0.707
    d  0.439        -0.880 -0.180
    e  0.221  0.701         0.678

                   Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
    SS loadings       1.0    1.0    1.0    1.0    1.0
    Proportion Var    0.2    0.2    0.2    0.2    0.2
    Cumulative Var    0.2    0.4    0.6    0.8    1.0

Note that this has a special print() method which suppresses printing of small loadings.
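To see every loading, including the small ones, the cutoff of that print method can be lowered. The following is a minimal sketch using the question's dummy data; the set.seed() call is an addition so that the run is repeatable, and the exact numbers will differ from the output shown in this answer.

```r
# Rebuild the question's dummy data.
# set.seed() is an addition (not in the question) for repeatability.
set.seed(42)
a <- rnorm(10, 50, 20)
b <- seq(10, 100, 10)
c <- seq(88, 10, -8)
d <- rep(seq(3, 16, 3), 2)
e <- rnorm(10, 61, 27)
my_table <- data.frame(a, b, c, d, e)

pca <- princomp(my_table, cor = TRUE)

# Default: loadings with absolute value below 0.1 are printed as blanks
print(pca$loadings)

# cutoff = 0: no loading is blanked out
print(pca$loadings, cutoff = 0)
```

The cutoff only affects what is printed; the stored values in pca$loadings are unchanged.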

If you want this as a relative contribution, take the absolute values of the loadings (to account for negative loadings), sum them per column, and express each loading as a proportion of its column sum.

    R> load <- with(pca, unclass(loadings))
    R> load
          Comp.1       Comp.2      Comp.3     Comp.4        Comp.5
    a -0.1980087  0.712680378  0.04606100 -0.6713848  0.000000e+00
    b  0.5997346 -0.014945831  0.33353047 -0.1698602  7.071068e-01
    c -0.5997346  0.014945831 -0.33353047  0.1698602  7.071068e-01
    d  0.4389388  0.009625746 -0.88032515 -0.1796321  5.273559e-16
    e  0.2208215  0.701104321 -0.02051507  0.6776944 -1.110223e-16

This final step then yields the proportional contribution to each principal component:

    R> aload <- abs(load) ## save absolute values
    R> sweep(aload, 2, colSums(aload), "/")
          Comp.1      Comp.2     Comp.3     Comp.4       Comp.5
    a 0.09624979 0.490386943 0.02853908 0.35933068 0.000000e+00
    b 0.29152414 0.010284050 0.20665322 0.09091055 5.000000e-01
    c 0.29152414 0.010284050 0.20665322 0.09091055 5.000000e-01
    d 0.21336314 0.006623362 0.54544349 0.09614059 3.728970e-16
    e 0.10733880 0.482421595 0.01271100 0.36270762 7.850462e-17

    R> colSums(sweep(aload, 2, colSums(aload), "/"))
    Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
         1      1      1      1      1
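The same steps can be wrapped in a small function so the result comes out directly in percent. This is only a sketch: contribution_pct is a hypothetical helper name, not a base R function, and set.seed() is added for repeatability, so the numbers will not match the output above.

```r
# Hypothetical helper (not part of base R): percentage share of each
# variable in each component, based on absolute loadings.
contribution_pct <- function(loadings_matrix) {
  aload <- abs(unclass(loadings_matrix))      # drop "loadings" class, take |x|
  sweep(aload, 2, colSums(aload), "/") * 100  # divide each column by its sum
}

# The question's dummy data; set.seed() is an addition for repeatability
set.seed(42)
a <- rnorm(10, 50, 20)
b <- seq(10, 100, 10)
c <- seq(88, 10, -8)
d <- rep(seq(3, 16, 3), 2)
e <- rnorm(10, 61, 27)
my_table <- data.frame(a, b, c, d, e)

pca <- princomp(my_table, cor = TRUE)

pct <- contribution_pct(pca$loadings)
round(pct, 1)   # one row per parameter, one column per component
colSums(pct)    # every column sums to 100
```

Because the helper only needs a numeric matrix with one column per component, it should work on any loadings matrix of that shape.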

If using the preferred prcomp() then the relevant loadings are in the $rotation component:

    R> pca2 <- prcomp(my_table, scale = TRUE)
    R> pca2$rotation
             PC1          PC2         PC3        PC4           PC5
    a -0.1980087  0.712680378 -0.04606100 -0.6713848  0.000000e+00
    b  0.5997346 -0.014945831 -0.33353047 -0.1698602 -7.071068e-01
    c -0.5997346  0.014945831  0.33353047  0.1698602 -7.071068e-01
    d  0.4389388  0.009625746  0.88032515 -0.1796321 -3.386180e-15
    e  0.2208215  0.701104321  0.02051507  0.6776944  5.551115e-17

And the relevant incantation is now:

    R> aload <- abs(pca2$rotation)
    R> sweep(aload, 2, colSums(aload), "/")
             PC1         PC2        PC3        PC4          PC5
    a 0.09624979 0.490386943 0.02853908 0.35933068 0.000000e+00
    b 0.29152414 0.010284050 0.20665322 0.09091055 5.000000e-01
    c 0.29152414 0.010284050 0.20665322 0.09091055 5.000000e-01
    d 0.21336314 0.006623362 0.54544349 0.09614059 2.394391e-15
    e 0.10733880 0.482421595 0.01271100 0.36270762 3.925231e-17
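A commonly used alternative, offered here as an assumption rather than as part of the method above, is to square the loadings instead of taking absolute values. Each column of $rotation is a unit-length vector, so the squared loadings in a column already sum to 1 and need no rescaling. A minimal sketch on the question's dummy data (set.seed() is an addition, so the numbers differ from the output above):

```r
# The question's dummy data; set.seed() is an addition for repeatability
set.seed(42)
a <- rnorm(10, 50, 20)
b <- seq(10, 100, 10)
c <- seq(88, 10, -8)
d <- rep(seq(3, 16, 3), 2)
e <- rnorm(10, 61, 27)
my_table <- data.frame(a, b, c, d, e)

pca2 <- prcomp(my_table, scale. = TRUE)

# Squared loadings as percentages: columns of $rotation have unit length,
# so each column of squared values sums to 1 (100 after scaling)
sq_pct <- pca2$rotation^2 * 100
round(sq_pct, 1)
colSums(sq_pct)
```

The two measures rank the parameters within a component in the same order, but the squared version gives more weight to the largest loadings, so the percentages themselves differ.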
Gavin Simpson answered Sep 20 '22 11:09