
how to draw samples with kernel-density-estimation

In kernel density estimation, the density at an arbitrary point in space can be estimated by (wiki):

\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)
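To make the estimate concrete, here is a minimal sketch of how it could be evaluated by hand. It assumes a Gaussian kernel with a single isotropic bandwidth and data stored as an array of shape (n_points, n_dims); the function name gaussian_kde_density is hypothetical and not part of sklearn.

```python
import numpy as np

def gaussian_kde_density(x, data, bandwidth):
    """Evaluate a Gaussian KDE at one query point x (illustrative helper)."""
    data = np.asarray(data, dtype=float)   # assumed shape: (n_points, n_dims)
    x = np.asarray(x, dtype=float)         # assumed shape: (n_dims,)
    n_dims = data.shape[1]
    # Squared Euclidean distance from x to every original point.
    sq_dist = np.sum((data - x) ** 2, axis=1)
    # Normalisation constant of an isotropic Gaussian with std = bandwidth.
    norm = (2.0 * np.pi * bandwidth ** 2) ** (n_dims / 2.0)
    # Average of the kernels centred on the original points.
    return np.mean(np.exp(-0.5 * sq_dist / bandwidth ** 2) / norm)
```

In one dimension this reduces to the usual average of 1/(n h) scaled Gaussian bumps, one per data point.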

In sklearn it is possible to draw samples from this distribution:

from sklearn.neighbors import KernelDensity
kde = KernelDensity().fit(z)  # fit kde to the data z
z_sampled = kde.sample(100)   # draw 100 samples

Is there an explicit formula to draw samples from such a distribution?

Oliver Wilken asked Mar 07 '23 22:03



1 Answer

It depends on the kernel.

But the general approach is simple. Let's assume a Gaussian kernel here:

  • Choose one original point x uniformly at random from the data X
  • Draw a value from the kernel centered on this point:
    • Gaussian: sample ~ Normal(x, b), where the mean x is the uniformly chosen point and the standard deviation b is the bandwidth.
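
The two steps above can be sketched with plain NumPy. This is only an illustration under the assumptions of a Gaussian kernel, a scalar bandwidth, and data of shape (n_points, n_dims); the name sample_gaussian_kde and the seed parameter are made up for this example.

```python
import numpy as np

def sample_gaussian_kde(data, bandwidth, n_samples, seed=None):
    """Draw samples from a Gaussian KDE built on `data` (illustrative helper)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)   # assumed shape: (n_points, n_dims)
    # Step 1: pick original points uniformly at random, with replacement.
    idx = rng.integers(0, data.shape[0], size=n_samples)
    # Step 2: add Gaussian noise with std = bandwidth around each picked point.
    return rng.normal(loc=data[idx], scale=bandwidth)

# Example call on two hypothetical 2-D points.
points = np.array([[0.0, 0.0], [10.0, 10.0]])
samples = sample_gaussian_kde(points, bandwidth=0.5, n_samples=5, seed=0)
```

A quick sanity check for such a sampler: per dimension, the variance of many samples should come out close to the variance of the original data plus the squared bandwidth.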

So yes, no fitting is needed for sampling. Everything depends only on the original data and the bandwidth parameter!

Compare with sklearn's implementation:

i = rng.randint(data.shape[0], size=n_samples)

if self.kernel == 'gaussian':
    return np.atleast_2d(rng.normal(data[i], self.bandwidth))

where I omitted the underlying tree structure needed for accessing data[i]. np.atleast_2d is just there to be compatible with sklearn's API.

sascha answered Mar 15 '23 03:03
