Could anyone provide a simple numeric example of the EM algorithm as I am not sure about the formulas given? A really simple one with 4 or 5 Cartesian coordinates would perfectly do.
Usage of EM algorithm –It can be used to fill the missing data in a sample. It can be used as the basis of unsupervised learning of clusters. It can be used for the purpose of estimating the parameters of Hidden Markov Model (HMM). It can be used for discovering the values of latent variables.
The EM algorithm is used to find (local) maximum likelihood parameters of a statistical model in cases where the equations cannot be solved directly. Typically these models involve latent variables in addition to unknown parameters and known data observations.
The expectation maximization algorithm is a natural generalization of maximum likelihood estimation to the incomplete data case. In particular, expectation maximization attempts to find the parameters θ̂ that maximize the log probability logP(x;θ) of the observed data.
Expectation maximization Like K-means, it's iterative, alternating two steps, E and M, which correspond to estimating hidden variables given the model and then estimating the model given the hidden variable estimates. Unlike K-means, the cluster assignments in EM for Gaussian mixtures are soft.
what about this: http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Clustering/Expectation_Maximization_(EM)#A_simple_example
I had also written a simple example in (edit)R a year ago, unfortunately I am unable to locate it. I'll try again to find it later.
EDIT: Here it is -
EM <- function() { ### Read file, get necessary cols dataFile <- read.csv("wine.csv", head = FALSE, sep = ",") sl <- dataFile[, 2] #sw <- dataFile[, 3] #pl <- dataFile[, 3] #pw <- dataFile[, 4] class <- dataFile[, 5] N <- length(sl) pi1 <- 0.5 ### Init ### rand1 <- floor(runif(1) * N) rand2 <- floor(runif(1) * N) mu1 <- sl[rand1] mu2 <- sl[rand2] mean1 <- sum(sl)/N sigma1 <- sum( (sl - mean1) ** 2) / N sigma2 <- sigma1 print(mu1) print(mu2) print(sigma1) print(sigma2) COUNTLIM <- 10 count <- 1 prevmu1 <- 0.0; prevmu2 <- 0.0; prevsigma1 <- 0.0; prevsigma2 <- 0.0; gamma <- array(0, length(sl)) while (count <= COUNTLIM) { gamma <- pi1 * dnorm(sl, mu2, sigma2)/ ( (1 - pi1) * dnorm(sl, mu1, sigma1) + pi1 * dnorm(sl, mu2, sigma2)) mu1 <- sum((1 - gamma) * sl) / sum(1 - gamma)
mu2 <- sum((gamma) * sl) / sum(gamma)
sigma1 <- sum((1 - gamma) * (sl - mu1) ** 2)/sum(1 - gamma) sigma2 <- sum((gamma) * (sl - mu2) ** 2)/sum(gamma) pi1 <- sum(gamma)/N print(c(mu1, mu2, sigma1, sigma2, pi1)) if (count == 1) { prevmu1 <- mu1; prevmu2 <- mu2; prevsigma1 <- sigma1; prevsigma2 <- sigma2; } else { val <- ((prevmu1 - mu1)*2 + (prevmu2 - mu2)*2 + (prevsigma1 - sigma1)*2 + (prevsigma2 - sigma2)*2) ** 0.5; print(c("val: " , val)) if (val <= 1) { break; } } count <- count + 1 } print(mu1) print(mu2) print(sigma1) print(sigma2) }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With