Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Peak of the kernel density estimation

I need to find as precisely as possible the peak of the kernel density estimation (modal value of the continuous random variable). I can find the approximate value:

x<-rlnorm(100)
d<-density(x)
plot(d)
i<-which.max(d$y)
d$y[i]
d$x[i]

But when calculating d$y precise function is known. How can I locate the exact value of the mode?

like image 857
Andy Avatar asked Apr 27 '13 18:04

Andy


People also ask

How is kernel density estimation calculated?

Kernel Density Estimation (KDE) It is estimated simply by adding the kernel values (K) from all Xj. With reference to the above table, KDE for whole data set is obtained by adding all row values. The sum is then normalized by dividing the number of data points, which is six in this example.

What is kernel density estimation plot?

A kernel density estimate (KDE) plot is a method for visualizing the distribution of observations in a dataset, analagous to a histogram. KDE represents the data using a continuous probability density curve in one or more dimensions. The approach is explained further in the user guide.

What is density in kernel density estimation?

In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a non-parametric method to estimate the probability density function of a random variable based on kernels as weights.

Why is kernel density estimation used?

Kernel density estimation is a technique for estimation of probability density function that is a must-have enabling the user to better analyse the studied probability distribution than when using a traditional histogram.


3 Answers

If I understand your question, I think you are just wanting a finer discretisation of x and y. To do this, you can change the value of n in the density function (default is n=512).

For example, compare

set.seed(1)
x = rlnorm(100)
d = density(x)
i = which.max(d$y)
d$y[i]; d$x[i]
0.4526; 0.722

with:

d = density(x, n=1e6)
i = which.max(d$y)
d$y[i]; d$x[i]
0.4525; 0.7228
like image 196
csgillespie Avatar answered Oct 15 '22 13:10

csgillespie


Here are two functions for dealing with modes. The dmode function finds the mode with the highest peak (dominate mode) and n.modes identify the number of modes.

    dmode <- function(x) {
      den <- density(x, kernel=c("gaussian"))
        ( den$x[den$y==max(den$y)] )   
    }  

    n.modes <- function(x) {  
       den <- density(x, kernel=c("gaussian"))
       den.s <- smooth.spline(den$x, den$y, all.knots=TRUE, spar=0.8)
         s.0 <- predict(den.s, den.s$x, deriv=0)
         s.1 <- predict(den.s, den.s$x, deriv=1)
       s.derv <- data.frame(s0=s.0$y, s1=s.1$y)
       nmodes <- length(rle(den.sign <- sign(s.derv$s1))$values)/2
       if ((nmodes > 10) == TRUE) { nmodes <- 10 }
          if (is.na(nmodes) == TRUE) { nmodes <- 0 } 
       ( nmodes )
    }

# Example
x <- runif(1000,0,100)
  plot(density(x))
    abline(v=dmode(x))
like image 41
Jeffrey Evans Avatar answered Oct 15 '22 14:10

Jeffrey Evans


I think you need two steps to archive what you need:

1) Find the x-Axis value of the KDE peak

2) Get the desnity value of the peak

So (if you dont mind using a package) a solution using the hdrcde package would look like this:

require(hdrcde)

x<-rlnorm(100)
d<-density(x)

# calcualte KDE with help of the hdrcde package
hdrResult<-hdr(den=d,prob=0)

# define the linear interpolation function for the density estimation
dd<-approxfun(d$x,d$y)
# get the density value of the KDE peak
vDens<-dd(hdrResult[['mode']])

Edit: You could also use the

hdrResult[['falpha']]

if it is precise enough for you!

like image 22
Deset Avatar answered Oct 15 '22 14:10

Deset