I want the correlations between individual variables and principal components in Python. I am using PCA in sklearn. How can I obtain the loading matrix after I have decomposed my data? My code is below.
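from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
data, y = iris.data, iris.target
pca = PCA(n_components=2)
transformed_data = pca.fit(data).transform(data)
eigenValues = pca.explained_variance_ratio_  # variance ratios, not raw eigenvalues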
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html doesn't mention how this can be achieved.
Principal component analysis (PCA). Linear dimensionality reduction using Singular Value Decomposition of the data to project it to a lower dimensional space. The input data is centered but not scaled for each feature before applying the SVD.
PCA loadings are the coefficients of the linear combination of the original variables from which the principal components (PCs) are constructed.
Numerically, under this definition, the loadings are equal to the coordinates of the variables divided by the square root of the eigenvalue associated with the component. Note that some sources use "loadings" for the coefficients themselves, while others use it for the coefficients scaled by the square root of the eigenvalue.
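To make the "coefficients of the linear combination" idea concrete, here is a minimal sketch on the iris data from the question. It assumes the default `whiten=False` and uses the fitted attributes `pca.components_` and `pca.mean_`; the variable names `scores_manual` and `scores_sklearn` are just illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=2).fit(X)

# Each row of components_ holds the coefficients (one per original variable)
# that define one principal component.
print(pca.components_.shape)  # (n_components, n_features)

# Rebuild the component scores by hand: center the data, then take the
# linear combination given by those coefficients.
scores_manual = (X - pca.mean_) @ pca.components_.T
scores_sklearn = pca.transform(X)

# The two should agree up to floating-point error.
print(np.allclose(scores_manual, scores_sklearn))
```

So each principal component is a weighted sum of the centered original variables, with the weights stored row by row in `pca.components_`.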
Performing PCA using Scikit-Learn is a two-step process: initialize the PCA class by passing the number of components to the constructor, then call the fit and transform methods on the feature set. The transform method returns the requested number of principal components.
Multiply each component by the square root of its corresponding eigenvalue:
pca.components_.T * np.sqrt(pca.explained_variance_)
This should produce your loading matrix, with one row per original variable and one column per component.
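One caveat, since the question asks for correlations: as far as I can tell, these loadings equal the variable-component correlations only when the variables have been standardized first (sklearn's PCA centers but does not scale). Here is a rough sketch that checks this on the iris data. It standardizes by hand with `ddof=1` so the scaling matches the n-1 normalization behind `explained_variance_`; the names `X_std`, `scores`, and `corr` are my own.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Standardize each variable to zero mean and unit sample variance (ddof=1).
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

pca = PCA(n_components=2)
scores = pca.fit_transform(X_std)

# Loadings: eigenvectors scaled by the square root of the eigenvalues.
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

# Direct Pearson correlations between each variable and each component,
# taken from the joint correlation matrix of [variables | scores].
n_features = X_std.shape[1]
joint = np.corrcoef(np.hstack([X_std, scores]), rowvar=False)
corr = joint[:n_features, n_features:]

print(loadings.shape)               # (n_features, n_components)
print(np.allclose(loadings, corr))  # loadings match the correlations
```

If you run PCA on the unscaled data instead, dividing each row of the loading matrix by that variable's sample standard deviation (`X.std(axis=0, ddof=1)`) should give the correlations.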