In kernel density estimation, the density at an arbitrary point in space can be estimated with the standard estimator formula (wiki).
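For reference, the estimator meant here is presumably the usual one from the Wikipedia article on kernel density estimation, shown below in its univariate form. The symbols are the conventional ones and are assumptions as far as this post goes: n data points x_i, a kernel K, and a bandwidth h (called b in the answer).

```latex
\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i)
             = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)
```

Read as a mixture, this is an equally weighted sum of n kernels, one centered on each data point.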
In sklearn it is possible to draw samples from this distribution:
from sklearn.neighbors import KernelDensity
kde = KernelDensity().fit(z)  # fit the KDE to the data z
z_sampled = kde.sample(100)  # draw 100 samples
Is there an explicit formula for drawing samples from such a distribution?
It depends on the kernel.
But the general approach is simple. Let's assume a Gaussian kernel here:
1. Choose a point x uniformly at random from X.
2. Draw a value from the kernel function: sample = Gaussian/Normal(x, b) (x = mean; b = standard deviation), where x is the uniformly chosen point and b is the bandwidth.
Yes, no fitting is needed for sampling. Everything depends only on the original data and the bandwidth parameter!
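The procedure described above can be sketched in plain NumPy. This is only a minimal illustration for the Gaussian kernel, not sklearn's code: the function name `sample_gaussian_kde`, its parameters, and the toy data `X` are all made up for this example.

```python
import numpy as np


def sample_gaussian_kde(data, bandwidth, n_samples, seed=None):
    """Draw samples from a Gaussian KDE built on `data` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        # Treat a 1-D input as n points with a single feature.
        data = data[:, None]
    # Step 1: pick data points uniformly at random (with replacement).
    idx = rng.integers(data.shape[0], size=n_samples)
    # Step 2: add Gaussian noise with standard deviation = bandwidth,
    # i.e. draw from Normal(mean=chosen point, std=bandwidth).
    return rng.normal(loc=data[idx], scale=bandwidth)


# Toy data: three well-separated points in 2-D (assumed for illustration).
X = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
samples = sample_gaussian_kde(X, bandwidth=0.5, n_samples=1000, seed=0)
print(samples.shape)
```

Each returned row is one sample, so the samples cluster around the original data points with a spread set by the bandwidth.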
Compare with sklearn's implementation:
i = rng.randint(data.shape[0], size=n_samples)
if self.kernel == 'gaussian':
    return np.atleast_2d(rng.normal(data[i], self.bandwidth))
Here I omitted the underlying tree structure needed to access data[i]. np.atleast_2d is just there for compatibility with sklearn's API.
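As a small aside on what np.atleast_2d does, the sketch below shows its effect on arrays of different dimensionality. The array names are made up for illustration. The point is only that the result always has at least two dimensions, which matches the (n_samples, n_features) layout sklearn returns.

```python
import numpy as np

# A 1-D array is promoted to a single row.
row = np.atleast_2d(np.array([1.0, 2.0, 3.0]))

# A 2-D array is returned with its shape unchanged.
already_2d = np.atleast_2d(np.zeros((4, 3)))

print(row.shape, already_2d.shape)
```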