
Sklearn PCA decomposition explained_variance_ratio_

I am a Python beginner and have recently been learning PCA decomposition. When I print explained_variance_ratio_, the results appear to be sorted by default, like this:

Ratio: [9.99067005e-01 8.40367350e-04 4.97276068e-05 2.46358647e-05 1.00120681e-05 8.25213366e-06]

This is the code that produced it:

from sklearn.decomposition import PCA

my_pca = PCA(n_components=7)
# df is my original DataFrame; these three columns are left out of the PCA
new_df = df.drop(labels=["salary", "department", "left"], axis=1)
low_mat = my_pca.fit_transform(new_df)
print("Ratio:", my_pca.explained_variance_ratio_)

I am confused about which components are the most important. Is there a way to pair each ratio with its component, one-to-one, like this:

Ratio: satisfaction_level 9.99067005e-01
......

Thank you!

苏世杰 asked Jan 30 '26 16:01


1 Answer

Since you have not said what satisfaction_level is, I assume it is a feature in your data set. I also assume that you are expecting feature-wise variance values.

PCA has a parameter called n_components, which sets the number of components you want to keep in the transformed space. PCA is used for dimensionality reduction, so n_components is normally smaller than the number of features you have. It can be at most min(n_samples, n_features).
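As a rough illustration of that limit, the sketch below fits PCA on made-up random data (the array X and its shape are assumptions, not the asker's data). It shows the reduced shape returned by fit_transform and that asking for more components than there are features is rejected at fit time.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 50 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

# Keep 2 of the 5 possible components.
low = PCA(n_components=2).fit_transform(X)
print(low.shape)

# Requesting more components than min(n_samples, n_features) fails when fitting.
too_many_rejected = False
try:
    PCA(n_components=6).fit(X)
except ValueError as err:
    too_many_rejected = True
    print("rejected:", err)
```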

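PCA does dimensionality reduction by rotating the feature axes so that the new axes capture the maximum variance. Each principal component is orthogonal to all the others, and each one is a linear combination of all the original features. So you will not see the same feature values as in your original data set, and a component does not correspond to a single original feature.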
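A small check of both properties, on made-up random data (X is an assumption for illustration): the rows of components_ are mutually orthogonal unit vectors, and with the default whiten=False the transformed data is the centered input projected onto those rows.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))

pca = PCA(n_components=3).fit(X)

# Rows of components_ are orthonormal, so this product is the identity matrix.
gram = pca.components_ @ pca.components_.T
print(np.round(gram, 6))

# Each transformed column is a linear combination of all original features.
manual = (X - pca.mean_) @ pca.components_.T
print(np.allclose(manual, pca.transform(X)))
```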

The components are chosen to capture as much variance as possible. The higher the variance a component captures, the larger the share of information it retains.

explained_variance_ratio_ is the fraction of the total variance explained by each of the selected components. The first component has the highest variance and the last component has the lowest, so the values are always sorted in decreasing order.
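The ordering can be checked directly. This sketch uses synthetic data whose columns have very different scales (an assumption chosen to make the effect visible) and also prints the cumulative ratio, which is a common way to decide how many components to keep.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 200 samples, 4 features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([100.0, 10.0, 1.0, 0.1])

pca = PCA(n_components=4).fit(X)
ratios = pca.explained_variance_ratio_

# One ratio per component (not per input column), in decreasing order.
is_sorted = bool(np.all(np.diff(ratios) <= 0))
cumulative = np.cumsum(ratios)

print("ratios:", ratios)
print("sorted descending:", is_sorted)
print("cumulative:", cumulative)
```

When every component is kept, as here, the cumulative ratio ends at 1.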

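The ratios therefore belong to the principal components, not to the original features. To see how much each original feature contributes to each component, look at the component weights:

pca_features = my_pca.components_

Each row of components_ is one principal component, and each column holds the weight of one original feature, in the same order as the columns of new_df. The transformed data itself is what fit_transform returns (low_mat in your code).

You can make a DataFrame out of components_ as well, using the original column names as column labels.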
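One possible way to build that DataFrame is sketched below. The data is a synthetic stand-in for new_df; the column names other than satisfaction_level and the PC1, PC2, ... row labels are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical stand-in for new_df; names and scales are illustrative only.
rng = np.random.default_rng(0)
new_df = pd.DataFrame(
    rng.normal(size=(100, 3)) * np.array([50.0, 5.0, 0.5]),
    columns=["average_monthly_hours", "number_project", "satisfaction_level"],
)

my_pca = PCA(n_components=3).fit(new_df)

# Rows = principal components, columns = original features.
loadings = pd.DataFrame(
    my_pca.components_,
    columns=new_df.columns,
    index=[f"PC{i + 1}" for i in range(my_pca.n_components_)],
)
loadings["explained_variance_ratio"] = my_pca.explained_variance_ratio_

# Original feature with the largest absolute weight in each component.
dominant = loadings[new_df.columns].abs().idxmax(axis=1)

print(loadings)
print(dominant)
```

The dominant feature is only a summary: every component still mixes all features, so a line such as "satisfaction_level 9.99e-01" is at best an approximation of what the ratio means.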

P.S.: Before applying PCA, make sure that you have standardised the input data. Otherwise features with large numeric ranges dominate the first component.
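A first ratio close to 1, as in the question, is what unscaled data tends to produce, although that is only a guess about the asker's data. The sketch below compares PCA with and without StandardScaler on synthetic, independent features with very different scales (the data is an assumption).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 3 independent features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3)) * np.array([1000.0, 1.0, 0.01])

# Without scaling, the large-scale feature absorbs almost all the variance.
raw_ratio = PCA(n_components=3).fit(X).explained_variance_ratio_

# With scaling, every feature has unit variance before PCA sees it.
scaled = make_pipeline(StandardScaler(), PCA(n_components=3)).fit(X)
scaled_ratio = scaled.named_steps["pca"].explained_variance_ratio_

print("raw:   ", raw_ratio)
print("scaled:", scaled_ratio)
```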

Kalsi answered Feb 01 '26 08:02


