In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs. - Wikipedia

Gaussian Mixture Models (GMMs)

GMM: A mixture of Gaussians is a superposition of K Gaussian densities of the following form.

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$$

Mixture coefficients: $\pi_k$ (the prior probability that a point belongs to cluster $k$).
We get the relation below by integrating both sides of the above equation, since each Gaussian density integrates to 1.

$$\sum_{k=1}^{K} \pi_k = 1$$
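As a sketch of the definition above (univariate case, so each $\Sigma_k$ reduces to a scalar variance; the parameter values are purely illustrative):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Univariate Gaussian density N(x | mu, sigma^2)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Illustrative two-component mixture (K = 2)
pis = np.array([0.3, 0.7])      # mixture coefficients pi_k, sum to 1
mus = np.array([-1.0, 2.0])     # component means mu_k
sigmas = np.array([0.5, 1.0])   # component standard deviations

def gmm_pdf(x):
    # p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2)
    return sum(pi * gaussian_pdf(x, mu, s) for pi, mu, s in zip(pis, mus, sigmas))
```

Because the $\pi_k$ sum to 1 and each component integrates to 1, `gmm_pdf` integrates to 1 as well, i.e. it is itself a valid density.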

Likelihood Function:

$$L(\theta \mid x) = \prod_{i=1}^{n} p(x_i) = \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)$$

where $\theta = \{\pi_1, \dots, \pi_K, \mu_1, \dots, \mu_K, \Sigma_1, \dots, \Sigma_K\}$.

Log Likelihood:

$$\log L(\theta \mid x) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)$$
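The log likelihood above can be evaluated directly; a minimal univariate sketch (the log-sum-exp trick is added for numerical stability, since products of small densities underflow quickly):

```python
import numpy as np

def gmm_log_likelihood(X, pis, mus, sigmas):
    # log L = sum_i log sum_k pi_k N(x_i | mu_k, sigma_k^2)
    X = X[:, None]                                          # shape (n, 1)
    # Per-point, per-component log densities, shape (n, K)
    log_pdf = (-0.5 * ((X - mus) / sigmas) ** 2
               - np.log(sigmas) - 0.5 * np.log(2 * np.pi))
    log_weighted = np.log(pis) + log_pdf                    # log(pi_k N(...))
    # log-sum-exp over components, then sum over data points
    m = log_weighted.max(axis=1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(log_weighted - m).sum(axis=1))).sum())
```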

Maximizing this is a hard problem: the sum inside the logarithm prevents a closed-form solution. The standard approach is the Expectation-Maximization (EM) algorithm, which alternates the two steps below until convergence.

E-step
$$\gamma_{ik} = p(k \mid x_i) = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$$
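The responsibilities $\gamma_{ik}$ can be computed in one vectorized pass; a univariate sketch (function name `e_step` is my own):

```python
import numpy as np

def e_step(X, pis, mus, sigmas):
    # gamma_ik = pi_k N(x_i | mu_k) / sum_j pi_j N(x_i | mu_j)
    X = X[:, None]                                          # shape (n, 1)
    pdf = (np.exp(-0.5 * ((X - mus) / sigmas) ** 2)
           / (sigmas * np.sqrt(2 * np.pi)))                 # shape (n, K)
    weighted = pis * pdf                                    # pi_k N(x_i | mu_k)
    # Normalize over components so each row sums to 1
    return weighted / weighted.sum(axis=1, keepdims=True)
```

Each row of the result is a probability distribution over the K clusters for one data point.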

M-step

$$N_k = \sum_{i=1}^{n} \gamma_{ik}$$

$$\mu_k^{\text{new}} = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} x_i$$

$$\Sigma_k^{\text{new}} = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} (x_i - \mu_k^{\text{new}})(x_i - \mu_k^{\text{new}})^T$$

$$\pi_k^{\text{new}} = \frac{N_k}{n}$$
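The M-step updates above can be sketched in the univariate case (where $\Sigma_k$ reduces to a scalar variance; function name `m_step` is my own):

```python
import numpy as np

def m_step(X, gamma):
    # gamma has shape (n, K): responsibilities from the E-step
    Nk = gamma.sum(axis=0)                        # N_k, effective count per cluster
    mus = (gamma * X[:, None]).sum(axis=0) / Nk   # mu_k^new, responsibility-weighted means
    # sigma_k^2 new: responsibility-weighted variances around the new means
    variances = (gamma * (X[:, None] - mus) ** 2).sum(axis=0) / Nk
    pis = Nk / len(X)                             # pi_k^new = N_k / n
    return pis, mus, np.sqrt(variances)
```

With hard 0/1 responsibilities this reduces to ordinary per-cluster sample means and variances, which is a useful sanity check.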

Advantages

How to choose K?