
Reducing dimensionality on training data with PCA in Matlab

This is a follow-up question to:

PCA Dimensionality Reduction

In order to classify the new 10-dimensional test data, do I have to reduce the training data down to 10 dimensions as well?

I tried:

X = bsxfun(@minus, trainingData, mean(trainingData,1));           
covariancex = (X'*X)./(size(X,1)-1);                 
[V D] = eigs(covariancex, 10);   % reduce to 10 dimension
Xtrain = bsxfun(@minus, trainingData, mean(trainingData,1));  
pcatrain = Xtest*V;

But using the classifier with this and the 10-dimensional test data produces very unreliable results. Is there something that I am doing fundamentally wrong?

Edit:

X = bsxfun(@minus, trainingData, mean(trainingData,1));           
covariancex = (X'*X)./(size(X,1)-1);                 
[V D] = eigs(covariancex, 10);   % reduce to 10 dimension
Xtrain = bsxfun(@minus, trainingData, mean(trainingData,1));  
pcatrain = Xtest*V;

X = bsxfun(@minus, pcatrain, mean(pcatrain,1));           
covariancex = (X'*X)./(size(X,1)-1);                 
[V D] = eigs(covariancex, 10);   % reduce to 10 dimension
Xtest = bsxfun(@minus, test, mean(pcatrain,1));  
pcatest = Xtest*V;
user3094936 asked Mar 21 '23 07:03



1 Answer

You have to reduce both the training and the test data, and both in the same way. Once you have obtained the projection matrix from PCA on the training data, use that same matrix (together with the training mean) to reduce the dimensionality of the test data. In short, you need a single, fixed transformation that is applied to both training and test samples.

Using your code:

% first, centre both sets using the mean of the TRAINING data
mu     = mean(Xtrain, 1);   % compute before Xtrain is overwritten
Xtrain = bsxfun(@minus, Xtrain, mu);
Xtest  = bsxfun(@minus, Xtest, mu);

% Compute PCA
covariancex = (Xtrain'*Xtrain)./(size(Xtrain,1)-1);                 
[V, D] = eigs(covariancex, 10);   % keep the 10 leading eigenvectors (reduce to 10 dimensions)

pcatrain = Xtrain*V;
% here you should train your classifier on pcatrain and ytrain (correct labels)

pcatest = Xtest*V;
% here you can test your classifier on pcatest using ytest (compare with correct labels)
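
To make the "fit on training data, apply to both" idea concrete, here is a minimal self-contained sketch on synthetic data. The names `testData`, `mu`, `k` and `order` are illustrative and not from the question, the random matrices merely stand in for real data, and `eig` with an explicit sort is used in place of `eigs` so that the ordering of the components is visible.

```matlab
% Minimal sketch: fit PCA on the training data only, then reuse the same
% mean and projection matrix for the test data.
% All data is synthetic and the variable names are assumptions.
trainingData = randn(200, 30);   % 200 training samples, 30 features
testData     = randn(50, 30);    % 50 test samples, same 30 features
k = 10;                          % target dimensionality

% 1) Centre both sets with the TRAINING mean.
mu     = mean(trainingData, 1);
Xtrain = bsxfun(@minus, trainingData, mu);
Xtest  = bsxfun(@minus, testData, mu);

% 2) Eigendecompose the training covariance and keep the top-k directions.
C = (Xtrain' * Xtrain) ./ (size(Xtrain, 1) - 1);
C = (C + C') / 2;                        % guard against round-off asymmetry
[V, D] = eig(C);
[~, order] = sort(diag(D), 'descend');   % largest variance first
V = V(:, order(1:k));                    % 30-by-k projection matrix

% 3) Project both sets with the same V.
pcatrain = Xtrain * V;   % 200-by-k, used to train the classifier
pcatest  = Xtest  * V;   % 50-by-k, used to evaluate it
```

Note that nothing is re-estimated from the test set: no second mean and no second call to `eig`. Running PCA again on the test data (or on `pcatrain`) would produce a different basis, and the classifier would then see features that no longer correspond to the ones it was trained on.
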
lejlot answered Apr 25 '23 13:04
