
R - 'princomp' can only be used with more units than variables

I am using R (through R Commander) to cluster my data. I have a smaller subset of my data containing 200 rows and about 800 columns. When I run k-means clustering and try to plot the result on a graph, I get the following error: "'princomp' can only be used with more units than variables"

I then created a test file of 10 rows and 10 columns, which plots fine, but when I add an extra column I get the error again. Why is this? I need to be able to plot my clusters. When I view my data set after performing k-means on it, I can see the extra results column which shows which cluster each row belongs to.

Is there anything I am doing wrong? Can I get rid of this error and plot my larger sample? Please help, I've been wrecking my head over this for a week now. Thanks guys.

CoolSteve asked Apr 16 '11 13:04



1 Answer

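The problem is that you have more variables (columns) than sample points (rows), so the principal component analysis that is run to produce the plot fails. That is also why your 10 x 10 test works but breaks as soon as you add an eleventh column.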

The help file for princomp explains this (see ?princomp):

 ‘princomp’ only handles so-called R-mode PCA, that is feature
 extraction of variables.  If a data matrix is supplied (possibly
 via a formula) it is required that there are at least as many
 units as variables.  For Q-mode PCA use ‘prcomp’.
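
As a rough sketch of what that means in practice, you could run the k-means and the PCA yourself and plot the clusters on the first two components from prcomp instead. The matrix x below is hypothetical random data standing in for your 200 x 800 subset, and 3 clusters is just an assumed value, so swap in your own data and cluster count:

```r
# Hypothetical stand-in for the real data: 200 rows (units), 800 columns (variables)
set.seed(42)
x <- matrix(rnorm(200 * 800), nrow = 200, ncol = 800)

# princomp() refuses this shape because there are fewer rows than columns
res <- try(princomp(x), silent = TRUE)
inherits(res, "try-error")

# k-means itself has no such restriction (3 clusters is an assumed value)
km <- kmeans(x, centers = 3)

# prcomp() works with more variables than units
pca <- prcomp(x)
scores <- pca$x[, 1:2]  # coordinates of each row on the first two components

# Plot the rows in PC1/PC2 space, coloured by their k-means cluster
plot(scores, col = km$cluster, pch = 19, xlab = "PC1", ylab = "PC2")
```

Note that prcomp returns at most min(rows, columns) components, so pca$x has 200 columns here rather than 800.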
jberg answered Sep 29 '22 12:09
