This is a follow-up question to:
PCA Dimensionality Reduction
To classify the new 10-dimensional test data, do I have to reduce the training data to 10 dimensions as well?
I tried:
X = bsxfun(@minus, trainingData, mean(trainingData,1));
covariancex = (X'*X)./(size(X,1)-1);
[V D] = eigs(covariancex, 10); % reduce to 10 dimension
Xtrain = bsxfun(@minus, trainingData, mean(trainingData,1));
pcatrain = Xtest*V;
But using the classifier with this and the 10-dimensional test data produces very unreliable results. Is there something that I am doing fundamentally wrong?
Edit:
X = bsxfun(@minus, trainingData, mean(trainingData,1));
covariancex = (X'*X)./(size(X,1)-1);
[V D] = eigs(covariancex, 10); % reduce to 10 dimension
Xtrain = bsxfun(@minus, trainingData, mean(trainingData,1));
pcatrain = Xtest*V;
X = bsxfun(@minus, pcatrain, mean(pcatrain,1));
covariancex = (X'*X)./(size(X,1)-1);
[V D] = eigs(covariancex, 10); % reduce to 10 dimension
Xtest = bsxfun(@minus, test, mean(pcatrain,1));
pcatest = Xtest*V;
You have to reduce both the training and the test data, but both in the same way. Once you have the projection matrix from PCA on the training data, use that same matrix (and the same training mean for centering) to reduce the dimensionality of the test data. In short, you need one fixed transformation that is applied to both training and test samples.
Using your code:
% first, center both sets with the training mean (store it before overwriting Xtrain)
mu = mean(Xtrain,1);
Xtrain = bsxfun(@minus, Xtrain, mu);
Xtest = bsxfun(@minus, Xtest, mu);
% Compute PCA
covariancex = (Xtrain'*Xtrain)./(size(Xtrain,1)-1);
[V, D] = eigs(covariancex, 10); % keep the 10 leading eigenvectors (reduce to 10 dimensions)
pcatrain = Xtrain*V;
% here you should train your classifier on pcatrain and ytrain (correct labels)
pcatest = Xtest*V;
% here you can test your classifier on pcatest using ytest (compare with correct labels)
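To make the "fit on training data, apply to both" idea concrete, here is a minimal self-contained sketch. The random data, the matrix sizes, and the names mu, k and project are assumptions chosen only for illustration; substitute your own trainingData and test matrices.

```matlab
% Synthetic stand-in data (hypothetical sizes): 200 training and 50 test
% samples with 30 features each.
rng(0);
trainingData = randn(200, 30);
testData     = randn(50, 30);
k = 10;                                  % number of components to keep

% Fit step: the mean and the basis come from the training data only.
mu = mean(trainingData, 1);
Xc = bsxfun(@minus, trainingData, mu);   % centered training data
C  = (Xc' * Xc) ./ (size(Xc, 1) - 1);    % sample covariance matrix
[V, ~] = eigs(C, k);                     % k leading eigenvectors of C

% Apply step: one fixed transformation, reused for every data set.
project  = @(X) bsxfun(@minus, X, mu) * V;
pcatrain = project(trainingData);        % 200-by-10
pcatest  = project(testData);            % 50-by-10
```

Because project captures mu and V once, the test set cannot accidentally be centered with its own mean or projected onto a different basis, which is the mistake that makes the classifier see training and test points in different coordinate systems.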