
how to interpret cca vegan output

Tags:

r

vegan

I have performed a canonical correspondence analysis in R using the vegan package, but I find the output very difficult to understand. The triplot is understandable, but all the numbers I get from summary(cca) are confusing to me, as I've just started to learn about ordination techniques. I would like to know how much of the variance in Y is explained by X (in this case, the environmental variables), and which of the independent variables are important in this model.

My output looks like this:

Partitioning of mean squared contingency coefficient:
              Inertia Proportion
Total           4.151     1.0000
Constrained     1.705     0.4109
Unconstrained   2.445     0.5891

Eigenvalues, and their contribution to the mean squared contingency coefficient 

Importance of components:
                        CCA1   CCA2    CCA3    CCA4    CCA5    CCA6      CCA7
Eigenvalue            0.6587 0.4680 0.34881 0.17690 0.03021 0.02257 0.0002014
Proportion Explained  0.1587 0.1127 0.08404 0.04262 0.00728 0.00544 0.0000500
Cumulative Proportion 0.1587 0.2714 0.35548 0.39810 0.40538 0.41081 0.4108600

                         CA1    CA2     CA3     CA4     CA5     CA6     CA7
Eigenvalue            0.7434 0.6008 0.36668 0.33403 0.28447 0.09554 0.02041
Proportion Explained  0.1791 0.1447 0.08834 0.08047 0.06853 0.02302 0.00492
Cumulative Proportion 0.5900 0.7347 0.82306 0.90353 0.97206 0.99508 1.00000

Accumulated constrained eigenvalues

Importance of components:
                        CCA1   CCA2   CCA3   CCA4    CCA5    CCA6      CCA7
Eigenvalue            0.6587 0.4680 0.3488 0.1769 0.03021 0.02257 0.0002014
Proportion Explained  0.3863 0.2744 0.2045 0.1037 0.01772 0.01323 0.0001200
Cumulative Proportion 0.3863 0.6607 0.8652 0.9689 0.98665 0.99988 1.0000000

Scaling 2 for species and site scores
* Species are scaled proportional to eigenvalues
* Sites are unscaled: weighted dispersion equal on all dimensions

Species scores

                 CCA1     CCA2    CCA3      CCA4      CCA5       CCA6
S.marinoi     -0.3890  0.39759  0.1080 -0.005704 -0.005372 -0.0002441
C.tripos       1.8428  0.23999 -0.1661 -1.337082  0.636225 -0.5204045
P.alata        1.6892  0.17910 -0.3119  0.997590  0.142028  0.0601177
P.seriata      1.4365 -0.15112 -0.8646  0.915351 -1.455675 -1.4054078
D.confervacea  0.2098 -1.23522  0.5317 -0.089496 -0.034250  0.0278820
C.decipiens    2.2896  0.65801 -1.0315 -1.246933 -0.428691  0.3649382
P.farcimen    -1.2897 -1.19148 -2.3562  0.032558  0.104148 -0.0068910
C.furca        1.4439 -0.02836 -0.9459  0.301348 -0.975261  0.4861669

Biplot scores for constraining variables

                CCA1    CCA2     CCA3     CCA4     CCA5     CCA6
Temperature  0.88651  0.1043 -0.07283 -0.30912 -0.22541  0.24771
Salinity     0.32228 -0.3490  0.30471  0.05140 -0.32600  0.44408
O2          -0.81650  0.4665 -0.07151  0.03457  0.20399 -0.20298
Phosphate    0.22667 -0.8415  0.41741 -0.17725 -0.06941 -0.06605
TotP        -0.33506 -0.6371  0.38858 -0.05094 -0.24700 -0.25107
Nitrate      0.15520 -0.3674  0.38238 -0.07154 -0.41349 -0.56582
TotN        -0.23253 -0.3958  0.16550 -0.25979 -0.39029 -0.68259
Silica       0.04449 -0.8382  0.15934 -0.22951 -0.35540 -0.25650

Which of all these numbers are important to my analysis? /anna

user3420443 asked Mar 20 '14 14:03




1 Answer

How much variation is explained by X?

In a CCA, variance isn't variance in the normal sense. We express it as the "mean squared contingency coefficient", or "inertia". All the info you need to ascertain how much "variation" in Y is explained by X is contained in the section of the output that I reproduce below:

Partitioning of mean squared contingency coefficient:
              Inertia Proportion
Total           4.151     1.0000
Constrained     1.705     0.4109
Unconstrained   2.445     0.5891

In this example the total inertia is 4.151, and your X variables (the "Constraints") explain 1.705 units of it, which is about 41%, leaving about 59% unexplained.
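As a rough check of how that table is put together, the sketch below re-derives the partition from the eigenvalues printed in the question. It uses base R only; the variable names are made up for illustration, and the inputs are the rounded values from the printout, so the results should only agree with the table to the displayed precision.

```r
# Eigenvalues copied by hand from the summary() output in the question.
constrained_eig   <- c(0.6587, 0.4680, 0.34881, 0.17690, 0.03021, 0.02257, 0.0002014)
unconstrained_eig <- c(0.7434, 0.6008, 0.36668, 0.33403, 0.28447, 0.09554, 0.02041)

# Each block of inertia is the sum of its eigenvalues.
constrained_inertia   <- sum(constrained_eig)
unconstrained_inertia <- sum(unconstrained_eig)
total_inertia         <- constrained_inertia + unconstrained_inertia

# Proportions of the total inertia, as in the "Proportion" column.
prop_constrained   <- constrained_inertia / total_inertia
prop_unconstrained <- unconstrained_inertia / total_inertia

print(round(c(total = total_inertia,
              constrained = constrained_inertia,
              unconstrained = unconstrained_inertia), 3))
print(round(c(constrained = prop_constrained,
              unconstrained = prop_unconstrained), 4))
```

The point is that "Constrained" and "Unconstrained" are just the sums of the CCA and CA eigenvalues, and the proportions are those sums divided by the total.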

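The next section, on eigenvalues, lets you see, in terms of both inertia explained and proportion explained, which axes contribute most to the explanatory "power" of the CCA (the Constrained part of the table above) and to the unexplained "variance" (the Unconstrained part of the table above). The first "Importance of components" table expresses each eigenvalue as a proportion of the total inertia; the "Accumulated constrained eigenvalues" table expresses the constrained eigenvalues as a proportion of the constrained inertia only.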
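To see where the two sets of "Proportion Explained" values for the constrained axes come from, here is a minimal base R sketch using the rounded eigenvalues and total inertia from the question's printout. The variable names are illustrative only, and the results should match the printed tables only up to rounding.

```r
# Constrained eigenvalues and total inertia, copied from the question's output.
constrained_eig <- c(CCA1 = 0.6587, CCA2 = 0.4680, CCA3 = 0.34881, CCA4 = 0.17690,
                     CCA5 = 0.03021, CCA6 = 0.02257, CCA7 = 0.0002014)
total_inertia <- 4.151

# Share of the TOTAL inertia per axis (first "Importance of components" table).
prop_of_total <- constrained_eig / total_inertia

# Share of the CONSTRAINED inertia per axis ("Accumulated constrained eigenvalues").
prop_of_constrained <- constrained_eig / sum(constrained_eig)

print(round(rbind(prop_of_total,
                  cum_of_total = cumsum(prop_of_total),
                  prop_of_constrained,
                  cum_of_constrained = cumsum(prop_of_constrained)), 4))
```

Read this way, CCA1 accounts for roughly 16% of all the inertia but roughly 39% of the part the environmental variables can explain.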

The next section contains the ordination scores. Think of these as the coordinates of the points in the triplot. For some reason you don't show the site scores in the output above, but they would normally be there. Note that these have been scaled. By default this uses scaling = 2, so, as the header of that section states, the species scores are scaled proportional to the eigenvalues and the site scores are left unscaled.

The "Biplot" scores are the locations of the arrow heads or the labels on the arrows - I forget exactly how the plot is drawn now.
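If you want the scores as plain matrices rather than reading them off the printout, vegan's scores() function can extract them. The sketch below is only an illustration: your data are not available, so it fits a CCA to the varespec and varechem example datasets that ship with vegan, and the choice of N, P and K as constraints is arbitrary.

```r
library(vegan)

# Stand-in data: example datasets shipped with vegan, not the data in the question.
data(varespec)
data(varechem)

# Arbitrary illustrative model with three constraining variables.
mod <- cca(varespec ~ N + P + K, data = varechem)

# Coordinates on the first two axes, using the same scaling as the summary above.
site_scores    <- scores(mod, display = "sites",   scaling = 2)
species_scores <- scores(mod, display = "species", scaling = 2)
biplot_scores  <- scores(mod, display = "bp",      scaling = 2)

print(head(site_scores))
print(head(species_scores))
print(biplot_scores)
```

Each result is a matrix with one row per site, species or constraining variable and one column per requested axis.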

Which of all these numbers are important to my analysis?

All of them are important. If you think the triplot is important and interpretable, note that it is based entirely on the information reported by summary(). If you have specific questions to ask of the data, then perhaps only certain sections will be of paramount importance to you.

However, StackOverflow is not the place to ask such questions of a statistical nature.

Gavin Simpson answered Oct 02 '22 05:10
