
Keep pandas index while applying sklearn

I have a dataset which has a DateTime index and I'm using PCA from sklearn to reduce the number of dimensions.

The following question bugs me - will PCA keep the order of the points in my series so that I can reuse the index from the original dataframe?

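df = pd.DataFrame(...)
# fit_transform returns a NumPy array by default, so wrap it before assigning an index
df2 = pd.DataFrame(pca.fit_transform(df))
df2.index = df.index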

Moreover, is there a better (safer) approach than doing this?

Kobe-Wan Kenobi asked Nov 09 '22 02:11



1 Answer

PCA discards the index (by default, fit_transform returns a plain NumPy array), but the underlying order of the rows is preserved (see the implementation of PCA's transform function*). So it is safe to wrap the result in a DataFrame and set df2.index = df.index.

*fit_transform is equivalent to fit followed by transform. Neither of them reorders the rows.
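As a minimal sketch of the above (the data, the column names, and n_components=2 are made up for illustration and are not from the question), the transformed array can be wrapped in a new DataFrame with the original index passed in explicitly. Transforming a single row on its own should then match the same row of the full result:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical data: 6 daily observations of 4 features.
idx = pd.date_range("2022-01-01", periods=6, freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(6, 4)), index=idx, columns=list("abcd"))

pca = PCA(n_components=2)  # assumed number of components

# By default fit_transform returns a NumPy array (no index),
# so build a new DataFrame and pass the original index explicitly.
components = pca.fit_transform(df)
df2 = pd.DataFrame(components, index=df.index, columns=["pc1", "pc2"])

# Sanity check: row 3 transformed on its own equals row 3 of the full result,
# which is what we expect if the rows are not reordered.
row = pca.transform(df.iloc[[3]])
assert np.allclose(row[0], df2.iloc[3].to_numpy())
```

Passing index=df.index to the constructor avoids a separate assignment step. Newer scikit-learn releases (1.2 and later, to my knowledge) also provide pca.set_output(transform="pandas"), which should return a DataFrame that carries the input index; treat that as an option to check against the installed version.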

user2685079 answered Jan 04 '23 03:01
