I have the following code:
import pandas as pd
import numpy as np
from sklearn.decomposition import TruncatedSVD

# `dates` was not defined in the original snippet; a daily range is assumed here
dates = pd.date_range('2000-01-01', periods=1000)
df = pd.DataFrame(np.random.randn(1000, 25), index=dates, columns=list('ABCDEFGHIJKLMOPQRSTUVWXYZ'))

def reduce(dim):
    # fit a truncated SVD with `dim` components on the whole DataFrame
    svd = TruncatedSVD(n_components=dim, n_iter=7, random_state=42)
    return svd.fit(df)

fitted = reduce(5)
How do I get the column names from fitted?
To continue from Mikhail's post:
Assume that you already have feature_names
from vectorizer.get_feature_names()
(get_feature_names_out() in recent scikit-learn releases, where the older method has been removed)
and that you have then called svd.fit(X).
Now you can extract the feature names, sorted from best to worst, using the following code:
best_features = [feature_names[i] for i in svd.components_[0].argsort()[::-1]]
The code above takes the indices that sort svd.components_[0] in descending order,
looks up each of those indices in feature_names
(all of the features), and builds the best_features
list.
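To make the argsort step concrete, here is a small sketch on toy data. The weights and names below are made-up stand-ins for svd.components_[0] and feature_names, not values from a fitted model.

```python
import numpy as np

# Toy stand-ins (hypothetical values) for svd.components_[0] and feature_names
weights = np.array([0.1, 0.7, -0.2, 0.4])
feature_names = ['plan', 'manag', 'busi', 'develop']

# argsort() returns indices in ascending order of weight; [::-1] reverses them
order = weights.argsort()[::-1]

# Look up each index in feature_names to get names from largest to smallest weight
best_features = [feature_names[i] for i in order]

print(order.tolist())   # [1, 3, 0, 2]
print(best_features)    # ['manag', 'develop', 'plan', 'busi']
```

The same two lines work unchanged on a real component row, since svd.components_[0] is also a 1-D NumPy array with one weight per feature.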
Then you can look at, for example, the 10 best features:
In [21]: best_features[:10]
Out[21]:
['manag',
'develop',
'busi',
'solut',
'initi',
'enterprise',
'project',
'program',
'process',
'plan']
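For the DataFrame in the question there is no vectorizer, so the column labels can probably play the role of feature_names. The following is only a sketch under that assumption: it uses a seeded random generator and drops the date index so that it runs on its own, and the resulting ranking has no real meaning on random data.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD

# Same shape and column labels as in the question (25 columns, 'N' is absent there too)
rng = np.random.RandomState(0)
columns = list('ABCDEFGHIJKLMOPQRSTUVWXYZ')
df = pd.DataFrame(rng.randn(1000, 25), columns=columns)

svd = TruncatedSVD(n_components=5, n_iter=7, random_state=42)
svd.fit(df)

# components_ has shape (n_components, n_features); its columns line up with df.columns
feature_names = list(df.columns)
order = svd.components_[0].argsort()[::-1]
best_features = [feature_names[i] for i in order]

print(best_features[:10])
```

Note that this ranks by signed weight, as in the answer above. If large negative weights should count as important too, sorting np.abs(svd.components_[0]) instead is one possible variation.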