How to draw samples with kernel density estimation

Question

In kernel density estimation, the density at an arbitrary point in space can be estimated by (wiki):

$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$

In sklearn it is possible to draw samples from this distribution:

from sklearn.neighbors import KernelDensity

kde = KernelDensity().fit(z)  # fit the KDE to the data z
z_sampled = kde.sample(100)   # draw 100 samples

Is there an explicit formula for drawing samples from such a distribution?

sascha · Accepted Answer

It depends on the kernel.

But the general approach is simple. Let's assume a Gaussian kernel here:

Choose one original point x uniformly at random from the data X.
Draw a value from the kernel centered on this point:
- Gaussian: sample = Normal(x, b), where the mean x is the uniformly chosen point and the standard deviation b is the bandwidth.

Yes, no fitting is needed for sampling. Everything depends only on the original data and the bandwidth parameter!
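
The two steps work because a KDE is an equal-weight mixture of kernels, one centered on each data point: picking a point uniformly selects a mixture component, and drawing from the kernel samples from that component. A minimal NumPy sketch for the Gaussian case might look like the following. The function name sample_gaussian_kde and its parameters are made up for illustration and are not part of sklearn.

```python
import numpy as np

def sample_gaussian_kde(data, bandwidth, n_samples, seed=None):
    """Draw samples from a Gaussian KDE over `data` (hypothetical helper).

    `data` is expected to have shape (n_points, n_dims); a 1-D array is
    treated as n_points one-dimensional observations.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    # Step 1: choose original points uniformly at random (with replacement).
    idx = rng.integers(0, data.shape[0], size=n_samples)
    # Step 2: draw from a normal centered on each chosen point,
    # with standard deviation equal to the bandwidth.
    return rng.normal(loc=data[idx], scale=bandwidth)

z = np.array([[0.0], [1.0], [5.0]])
samples = sample_gaussian_kde(z, bandwidth=0.5, n_samples=1000, seed=0)
print(samples.shape)  # (1000, 1)
```

The same bandwidth is used for every coordinate here, which assumes an isotropic Gaussian kernel.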

Compare with sklearn's implementation:

i = rng.randint(data.shape[0], size=n_samples)

if self.kernel == 'gaussian':
    return np.atleast_2d(rng.normal(data[i], self.bandwidth))

where I omitted the underlying tree structure needed for accessing data[i]. np.atleast_2d is just there for compatibility with sklearn's API.
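
Since the answer depends on the kernel, only the second step changes for a different one. As a rough sketch, assuming one-dimensional data and a tophat (uniform) kernel, the draw becomes a uniform value within one bandwidth of the chosen point. The helper name sample_tophat_kde_1d is hypothetical, and this is not sklearn's code, which handles the general d-dimensional case differently.

```python
import numpy as np

def sample_tophat_kde_1d(data, bandwidth, n_samples, seed=None):
    """Draw samples from a 1-D KDE with a tophat kernel (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float).ravel()
    # Step 1: choose original points uniformly at random (with replacement).
    centers = data[rng.integers(0, data.shape[0], size=n_samples)]
    # Step 2: draw uniformly from [center - bandwidth, center + bandwidth).
    return centers + rng.uniform(-bandwidth, bandwidth, size=n_samples)

points = np.array([0.0, 1.0, 5.0])
tophat_samples = sample_tophat_kde_1d(points, bandwidth=0.5, n_samples=1000, seed=0)
print(tophat_samples.shape)  # (1000,)
```

Every sample produced this way lies within one bandwidth of some original data point, unlike the Gaussian case, where the tails are unbounded.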
