How do I see what bandwidth gets used for kernels in a density plot and how do I specify a bandwidth to be used? I tried
ggplot(mtcars,aes(mpg))+geom_density(bw=1)
with no luck.
Changing the bandwidth changes the width of the kernel: a lower bandwidth means only points very close to the current position are given any weight, which makes the estimate look squiggly; a higher bandwidth gives a wide, flat kernel in which distant points also contribute, which smooths the estimate.
The density() function in R computes the values of the kernel density estimate. Applying the plot() function to an object created by density() will plot the estimate. Applying the summary() function to the object will reveal useful statistics about the estimate.
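As a minimal sketch of the steps just described, using the built-in mtcars data from the question: the object returned by density() stores the bandwidth it used in its bw component, which is one way to see what bandwidth was applied.

```r
# Kernel density estimate of mpg with default settings
d <- density(mtcars$mpg)

print(d$bw)        # the bandwidth that was actually used
print(summary(d))  # summary statistics of the grid (x) and the estimate (y)
plot(d)            # draw the estimated density curve
```

Printing d itself also reports the bandwidth in its header line.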
The bandwidth defines how close a data point must be to a position r to influence the density estimate at r. A small bandwidth only takes the closest values into account, so the estimate follows the data closely. A large bandwidth takes more points into account and gives a smoother estimate.
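To make the contrast concrete, here is a small sketch that overlays two estimates of the same data. The bandwidth values 0.5 and 5 are arbitrary choices for illustration, not recommendations.

```r
x <- mtcars$mpg

# Same data, two fixed bandwidths (illustrative values only)
d_small <- density(x, bw = 0.5)  # narrow kernel: follows the data closely
d_large <- density(x, bw = 5)    # wide kernel: smoother estimate

# Overlay both curves for comparison
plot(d_small, main = "bw = 0.5 (solid) vs bw = 5 (dashed)")
lines(d_large, lty = 2)
```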
You can set multiple properties within the geom_density layer. One of them is the adjust property, a multiplicative bandwidth adjustment. Its default value is 1. Reducing adjust narrows the bandwidth and makes the density plot less smooth.
stat_density uses the adjust argument to apply a multiplier to the optimal bandwidth that ggplot calculates (see the documentation for density()). Try:
ggplot(mtcars,aes(mpg))+geom_density() + stat_density(adjust = 2)
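A slightly fuller sketch of the same idea, with the multiplier set directly on geom_density(). The second plot assumes a ggplot2 version in which geom_density() accepts a bw argument and passes it on to the density computation; the question suggests this did not work in the version used there, so treat it as something to check against your installed version.

```r
library(ggplot2)

# Default bandwidth scaled down and up with `adjust`
p_adjust <- ggplot(mtcars, aes(mpg)) +
  geom_density(adjust = 0.5, colour = "red") +   # half the default bandwidth
  geom_density(adjust = 2, colour = "blue")      # twice the default bandwidth

# Fixed bandwidth (assumes a ggplot2 version that supports `bw` here)
p_bw <- ggplot(mtcars, aes(mpg)) +
  geom_density(bw = 1)

print(p_adjust)
print(p_bw)
```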
I gather that to determine how the optimal bandwidth is calculated (the bandwidth being "the standard deviation of the smoothing kernel") you'll need to consult Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer.
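For what it's worth, the default in density() is bw = "nrd0", which is computed by the base R function bw.nrd0(). The sketch below reproduces that rule of thumb by hand and shows how adjust scales it; it only covers the density() default and assumes ggplot2 hands the computation to density() unchanged.

```r
x <- mtcars$mpg

# Default rule of thumb used by density(): bw.nrd0()
bw_default <- bw.nrd0(x)

# The same rule written out: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)
bw_manual <- 0.9 * min(sd(x), IQR(x) / 1.34) * length(x)^(-1/5)

# Bandwidth stored in the density object, without and with adjust
bw_used    <- density(x)$bw
bw_doubled <- density(x, adjust = 2)$bw   # adjust multiplies the bandwidth

print(c(default = bw_default, manual = bw_manual,
        used = bw_used, doubled = bw_doubled))
```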