I am using mclust to find clusters in my data set from several input variables (X, Y, Z, R, and S in the script below), e.g.:
elements<-cbind(X,Y,Z,R,S)
dataclust<-Mclust(elements)
I just found out that the order of the input variables matters and affects the results; in other words, elements <- cbind(X,Y,Z,R,S) gives different clusters than, say, elements <- cbind(Y,Z,X,R,S).
My understanding is that all the input variables have the same weight and importance in the clustering analysis. Am I wrong, or is this a bug?
I have seen this in R 2.15.3 and two other R versions.
Any comment on or explanation of the above is appreciated.
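One way to check whether column order really changes the fit is to cluster the same data twice, once with the columns permuted, and compare the two partitions. The following is only a minimal sketch: it uses the built-in iris measurements as stand-in data (an assumption, not the data from the question) and mclust's adjustedRandIndex() to compare the classifications.

```r
library(mclust)

# Built-in data used purely for illustration; substitute your own columns.
elements <- as.matrix(iris[, 1:4])

# The same data with the columns in a different order.
permuted <- elements[, c(3, 1, 4, 2)]

fitA <- Mclust(elements)
fitB <- Mclust(permuted)

# Compare the selected covariance model, number of components and BIC.
c(fitA$modelName, fitB$modelName)
c(fitA$G, fitB$G)
c(fitA$bic, fitB$bic)

# Adjusted Rand index: a value of 1 means the two partitions are
# identical up to a relabelling of the clusters.
ari <- adjustedRandIndex(fitA$classification, fitB$classification)
ari
```

If the model, the number of components and the BIC agree and the index is 1, the column order made no difference for that data set; anything else suggests the two fits genuinely differ and is worth reporting with a reproducible example.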
Unfortunately, I am unable to comment or edit my previous comment, so I'm posting an answer. @m-dz set me on a path that I think has revealed a possible answer. Specifically:
> library(mclust)
    __  ___________    __  _____________
   /  |/  / ____/ /   / / / / ___/_  __/
  / /|_/ / /   / /   / / / /\__ \ / /
 / /  / / /___/ /___/ /_/ /___/ // /
/_/  /_/\____/_____/\____//____//_/    version 5.2.2
Type 'citation("mclust")' for citing this R package in publications.
> testDataA <- read.table("http://fimi.ua.ac.be/data/chess.dat")
> summary(Mclust(subset(testDataA, select = c(V1, V3, V5, V7, V9, V11))))
----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm
----------------------------------------------------
Mclust EII (spherical, equal volume) model with 9 components:
log.likelihood n df BIC ICL
-3597.466 3196 63 -7703.32 -7735.137
Clustering table:
1 2 3 4 5 6 7 8 9
774 150 752 486 227 224 238 178 167
> summary(Mclust(subset(testDataA, select = c(V11, V9, V1, V3, V5, V7))))
----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm
----------------------------------------------------
Mclust EII (spherical, equal volume) model with 9 components:
log.likelihood n df BIC ICL
-3597.466 3196 63 -7703.32 -7735.137
Clustering table:
1 2 3 4 5 6 7 8 9
774 150 752 486 227 224 238 178 167
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.5
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mclust_5.2.2
loaded via a namespace (and not attached):
[1] tools_3.3.2
As you can see, the two column orders now produce identical solutions, and they match @m-dz's! Previously, however, I had also loaded the psych package, and I now see that it masks sim from mclust. I'm guessing this is what causes the incorrect solutions:
> library(psych)
Attaching package: ‘psych’
The following object is masked from ‘package:mclust’:
sim
> testDataB <- read.file(f = "http://fimi.ua.ac.be/data/chess.dat")
Data from the .data file http://fimi.ua.ac.be/data/chess.dat has been loaded.
> summary(Mclust(subset(testDataB, select = c(X1, X3, X5, X7, X9, X11))))
----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm
----------------------------------------------------
Mclust EEV (ellipsoidal, equal volume and shape) model with 2 components:
log.likelihood n df BIC ICL
3547.068 3195 49 6698.738 6692.126
Clustering table:
1 2
2759 436
> summary(Mclust(subset(testDataB, select = c(X11, X9, X1, X3, X5, X7))))
----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm
----------------------------------------------------
Mclust EEV (ellipsoidal, equal volume and shape) model with 6 components:
log.likelihood n df BIC ICL
18473.94 3195 137 35842.37 35834.51
Clustering table:
1 2 3 4 5 6
431 932 210 881 524 217
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.5
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] psych_1.6.9 mclust_5.2.2
loaded via a namespace (and not attached):
[1] parallel_3.3.2 tools_3.3.2 foreign_0.8-67 mnormt_1.5-5
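Before attributing the difference to the masked sim, it may be worth ruling out the data itself. The two sessions report n = 3196 versus n = 3195, and the columns are named V1, V3, ... in one and X1, X3, ... in the other, which is the pattern you would get if the first line of the file had been read as a header. Masking on the search path also normally affects only unqualified calls typed at the prompt, not calls made inside a package's own namespace. The sketch below uses made-up inline data (not the chess file) and base R's read.table() to show that effect; whether read.file() behaves this way by default is an assumption to verify.

```r
# Three lines of made-up numeric data standing in for the real file.
raw <- "1 3 5\n1 3 6\n2 4 5"

# Read once treating every line as data, once treating line 1 as a header.
withoutHeader <- read.table(text = raw, header = FALSE)
withHeader    <- read.table(text = raw, header = TRUE)

# Row counts differ by one: the first line was consumed as column names.
nrow(withoutHeader)
nrow(withHeader)

# Default names are V1, V2, ... without a header; with a numeric header
# line, the values are turned into syntactic names by prefixing "X".
names(withoutHeader)
names(withHeader)
```

If this is what happened, X1, X3 and X5 would refer to the first three columns of the file, whereas V1, V3 and V5 refer to the first, third and fifth, so the two Mclust calls would not be clustering the same variables. Comparing dim(), names() and head() of testDataA and testDataB would confirm or rule this out.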