
Principal Component Analysis on Weka

I have just run PCA on a training set in Weka, and it returned the new attributes along with how they were selected and computed. Now I want to build a model on the transformed data and then apply that model to a test set.

Is there a way to automatically transform the test set into the same new attributes?

alexlipa asked Jun 26 '14 23:06




2 Answers

Do you need the principal components for analysis, or just to feed into the classifier? If you only need them as classifier input, use the Meta->FilteredClassifier classifier. Set the filter to PrincipalComponents and the classifier to whatever classifier you want to use. Train it on the untransformed training set; the filter built from the training data is then applied automatically, so you can feed it the untransformed test set directly.

If you really need the transformed test set, I'd recommend using the Knowledge Flow tool to build a flow for it (the original screenshot is not available).

user22320 answered Oct 04 '22 11:10


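To do this from the command line, see the batch filtering documentation at: https://weka.wikispaces.com/Batch+filtering

Here is an example. The -b flag enables batch mode: the filter is built on the training file (-i, written to -o) and the same filter is then applied to the test file (-r, written to -s):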

java weka.filters.supervised.attribute.AttributeSelection \
  -b -i train.arff -o train_pca.arff \
  -r test.arff -s test_pca_output.arff \
  -E "weka.attributeSelection.PrincipalComponents -R 0.95 -A 5" \
  -S "weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N -1" 
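
To make the idea behind batch mode concrete, here is a minimal sketch in plain Python. It is not Weka's implementation: it keeps only the first principal component, uses power iteration instead of a full eigendecomposition, and the function names and toy data are made up for illustration. What it does show is the key point: the means and the component direction are learned from the training rows only, and the test rows are projected with those same parameters.

```python
import math

def fit_pca_1d(rows, iters=200):
    """Learn column means and the first principal axis from training rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the training data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the start vector is not orthogonal to the dominant axis
    # and that the data is not constant (norm would be zero).
    axis = [1.0] * d
    for _ in range(iters):
        nxt = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        axis = [x / norm for x in nxt]
    return means, axis

def transform(rows, means, axis):
    """Project rows onto the axis, centering with the *training* means."""
    return [sum((r[j] - means[j]) * axis[j] for j in range(len(axis)))
            for r in rows]

# Hypothetical toy data standing in for train.arff and test.arff.
train = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
test = [[5.0, 5.0], [0.0, 0.0]]

means, axis = fit_pca_1d(train)          # fitted on the training set only
train_pc = transform(train, means, axis)
test_pc = transform(test, means, axis)   # same parameters reused for test
```

Running PCA separately on the test set would produce a different set of axes, so a model trained on the training components would no longer match the test attributes. Reusing the fitted parameters is what both batch mode and FilteredClassifier take care of for you.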
Josep Valls answered Oct 04 '22 12:10