 

KL-Divergence of two GMMs

I have two GMMs that I used to fit two different sets of data in the same space, and I would like to calculate the KL-divergence between them.

Currently I am using the GMMs defined in sklearn (http://scikit-learn.org/stable/modules/generated/sklearn.mixture.GMM.html) and the SciPy implementation of KL-divergence (http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.stats.entropy.html).

How would I go about doing this? Do I want to just create tons of random points, get their probabilities on each of the two models (call them P and Q) and then use those probabilities as my input? Or is there some more canonical way to do this within the SciPy/SKLearn environment?

Andrew Latham asked Sep 27 '14


People also ask

What is the KL divergence between two equal distributions?

The Kullback-Leibler divergence (KL divergence) quantifies how much one probability distribution differs from another. The KL divergence between two distributions P and Q is usually written KL(P || Q). When the two distributions are equal, the KL divergence is zero.

What is a good value for KL divergence?

If two distributions match perfectly, D_{KL}(p||q) = 0; otherwise it can take values between 0 and ∞. The lower the KL divergence, the better our approximation matches the true distribution.

What is true about Kullback-Leibler divergence?

The Kullback-Leibler divergence is not symmetric, i.e., KL(p||q) ≠ KL(q||p), and it can be shown to be a nonnegative quantity (the proof is similar to the proof that mutual information is nonnegative; see Problem 12.16 of Chapter 12). Moreover, it is zero if and only if p = q.

Can KL divergence be used as a distance measure?

Although the KL divergence measures the "distance" between two distributions, it is not a distance measure, because it is not a metric. In particular, it is not symmetric: the KL divergence from p(x) to q(x) is generally not the same as the KL divergence from q(x) to p(x).
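
As a quick illustration with discrete distributions (the probability vectors below are made up for the example), scipy.stats.entropy computes KL(p || q) when given two probability vectors, and swapping the arguments generally gives a different value:

import numpy as np
from scipy.stats import entropy

# Two illustrative discrete distributions over three outcomes.
p = np.array([0.6, 0.3, 0.1])
q = np.array([0.2, 0.3, 0.5])

# entropy(pk, qk) returns sum(pk * log(pk / qk)), i.e. KL(pk || qk).
print(entropy(p, q))  # KL(p || q)
print(entropy(q, p))  # KL(q || p) -- generally a different value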


1 Answer

There's no closed form for the KL divergence between GMMs. You can easily do Monte Carlo, though. Recall that KL(p||q) = \int p(x) log(p(x) / q(x)) dx = E_p[log(p(x) / q(x))]. So:

def gmm_kl(gmm_p, gmm_q, n_samples=10**5):
    # Monte Carlo estimate of KL(p || q): sample from p, then average
    # log p(x) - log q(x). Written for the old sklearn.mixture.GMM API,
    # where score_samples returns (log-likelihoods, responsibilities).
    X = gmm_p.sample(n_samples)
    log_p_X, _ = gmm_p.score_samples(X)
    log_q_X, _ = gmm_q.score_samples(X)
    return log_p_X.mean() - log_q_X.mean()

(mean(log(p(x) / q(x))) = mean(log(p(x)) - log(q(x))) = mean(log(p(x))) - mean(log(q(x))) is somewhat cheaper computationally.)
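
If you are on a newer scikit-learn where sklearn.mixture.GMM has been replaced by sklearn.mixture.GaussianMixture, the same estimator looks roughly like the sketch below (the function name is just illustrative): sample now returns a (samples, component labels) tuple, and score_samples returns the per-sample log-densities directly.

def gmm_kl_gaussian_mixture(gmm_p, gmm_q, n_samples=10**5):
    # Same Monte Carlo estimate, adapted to sklearn.mixture.GaussianMixture.
    X, _ = gmm_p.sample(n_samples)      # sample() returns (X, component_labels)
    log_p_X = gmm_p.score_samples(X)    # per-sample log p(x)
    log_q_X = gmm_q.score_samples(X)    # per-sample log q(x)
    return log_p_X.mean() - log_q_X.mean()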

You don't want to use scipy.stats.entropy; that's for discrete distributions.

If you want the symmetrized and smoothed Jensen-Shannon divergence (KL(p||(p+q)/2) + KL(q||(p+q)/2)) / 2 instead, it's pretty similar:

import numpy as np

def gmm_js(gmm_p, gmm_q, n_samples=10**5):
    # Monte Carlo estimate of the Jensen-Shannon divergence between two GMMs,
    # again using the old sklearn.mixture.GMM API.
    X = gmm_p.sample(n_samples)
    log_p_X, _ = gmm_p.score_samples(X)
    log_q_X, _ = gmm_q.score_samples(X)
    log_mix_X = np.logaddexp(log_p_X, log_q_X)  # log(p(x) + q(x))

    Y = gmm_q.sample(n_samples)
    log_p_Y, _ = gmm_p.score_samples(Y)
    log_q_Y, _ = gmm_q.score_samples(Y)
    log_mix_Y = np.logaddexp(log_p_Y, log_q_Y)  # log(p(y) + q(y))

    # Subtract log(2) to turn log(p + q) into log((p + q) / 2), the mixture density.
    return (log_p_X.mean() - (log_mix_X.mean() - np.log(2))
            + log_q_Y.mean() - (log_mix_Y.mean() - np.log(2))) / 2

(log_mix_X/log_mix_Y are actually the log of twice the mixture densities; pulling that out of the mean operation saves some flops.)
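
For completeness, a hypothetical end-to-end usage sketch (the data and GMM settings below are invented purely for illustration, and it assumes the old sklearn.mixture.GMM API used above; on newer scikit-learn, substitute GaussianMixture and adapt the sample/score_samples calls as noted earlier):

import numpy as np
from sklearn.mixture import GMM

# Illustrative data: two different sets of points in the same 2-D space.
rng = np.random.RandomState(0)
data_a = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))
data_b = rng.normal(loc=1.0, scale=1.5, size=(1000, 2))

gmm_a = GMM(n_components=3, covariance_type='full').fit(data_a)
gmm_b = GMM(n_components=3, covariance_type='full').fit(data_b)

print("KL(a || b):", gmm_kl(gmm_a, gmm_b))
print("JS(a, b):  ", gmm_js(gmm_a, gmm_b))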

Danica answered Sep 20 '22