
Kullback-Leibler (KL) distance between histograms - matlab

function [ d ] = hcompare_KL( h1,h2 )
% This routine evaluates the Kullback-Leibler (KL) distance between histograms.
%             Input:      h1, h2 - histograms
%             Output:     d - the distance between the histograms
%             Method:     KL is defined as: KL(h1,h2) = sum_i h1(i) * log( h1(i) / h2(i) )
%             Note: KL is not symmetric, so compute both directions and add them.
%             Take care not to divide by zero or take the log of zero: disregard entries of the sum for which h2(i) == 0.

temp = sum(h1 .* log(h1 ./ h2));
temp( isinf(temp) ) = 0; % this resolves the case where h1(i) == 0
d1 = sum(temp);

temp = sum(h2 .* log(h2 ./ h1)); % other direction of the comparison, since KL is not symmetric
temp( isinf(temp) ) = 0;
d2 = sum(temp);

d = d1 + d2;

end

My problem is that whenever h1(i) or h2(i) == 0, I get inf, which is expected. However, for the KL distance I'm supposed to return 0 whenever h1(i) or h2(i) == 0. How can I do that without using a loop?

Gilad asked Nov 13 '12 22:11


1 Answer

To avoid having issues when any of the counts is 0, I suggest you create an index that marks the "good" data points:

%# you may want to do some input testing, such as whether h1 and h2 are
%# of the same size

%# create an index of the "good" data points
goodIdx = h1>0 & h2>0; %# bin counts <0 are not good, either

%# sum each direction over the good bins only; the skipped bins contribute 0
d1 = sum(h1(goodIdx) .* log(h1(goodIdx) ./ h2(goodIdx)));
d2 = sum(h2(goodIdx) .* log(h2(goodIdx) ./ h1(goodIdx)));

%# the distance is a scalar: the sum of both directions
d = d1 + d2;
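
For reference, the masking idea above could be wrapped into a complete function along the following lines. This is only a sketch, assuming both histograms have the same number of bins and that the distance should be returned as a single scalar. The size check, the conversion to double column vectors, and the temporaries p and q are additions for illustration and are not part of the original question. The key point is that the mask is applied per bin before summing. The code in the question sums first and only then tests the single resulting value with isinf. Also, a bin with h1(i) == 0 produces 0 * log(0), which is NaN rather than inf.

```matlab
function d = hcompare_KL(h1, h2)
%HCOMPARE_KL Symmetric Kullback-Leibler distance between two histograms.
%   Sketch only: bins where either histogram is not strictly positive are
%   skipped. That is the convention stated in the question, not a general rule.

% Assumption: both histograms have the same number of bins.
assert(numel(h1) == numel(h2), 'h1 and h2 must have the same number of bins');

% Flatten to double columns so row and column inputs behave the same.
h1 = double(h1(:));
h2 = double(h2(:));

% Keep only the bins where both counts are strictly positive.
goodIdx = h1 > 0 & h2 > 0;
p = h1(goodIdx);
q = h2(goodIdx);

% KL(h1,h2) + KL(h2,h1), summed over the kept bins only.
% If no bin is kept, both sums are over empty vectors and d is 0.
d = sum(p .* log(p ./ q)) + sum(q .* log(q ./ p));
end
```

For example, with h1 = [0.5 0.5 0] and h2 = [0.25 0.5 0.25], the third bin is skipped and the result is 0.25*log(2), roughly 0.173. Whether the histograms should be normalized to sum to 1 before the call is left open here, since the question does not say.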
Jonas answered Oct 11 '22 17:10