I am trying to get density estimates for the log of stock prices in R. I know I can plot it using plot(density(x))
. However, I actually want values for the function.
I'm trying to implement the kernel density estimation formula. Here's what I have so far:
a <- read.csv("boi_new.csv", header=FALSE)
S = a[,3] # takes column of increments in stock prices
dS=S[!is.na(S)] # omits first empty field
N = length(dS) # Sample size
rseed = 0 # Random seed
x = rep(c(1:5),N/5) # Inputted data
set.seed(rseed) # Sets random seed for reproducibility
QL <- function(dS){
h = density(dS)$bandwidth
r = log(dS^2)
f = 0*x
for(i in 1:N){
f[i] = 1/(N*h) * sum(dnorm((x-r[i])/h))
}
return(f)
}
QL(dS)
Any help would be much appreciated. Been at this for days!
The density() function in R computes the values of the kernel density estimate. Applying the plot() function to an object created by density() will plot the estimate. Applying the summary() function to the object will reveal useful statistics about the estimate.
The KDE is calculated by weighting the distances of all the data points we've seen for each location on the blue line. If we've seen more points nearby, the estimate is higher, indicating that probability of seeing a point at that location.
Description. As known as Kernel Density Plots, Density Trace Graph. A Density Plot visualises the distribution of data over a continuous interval or time period. This chart is a variation of a Histogram that uses kernel smoothing to plot values, allowing for smoother distributions by smoothing out the noise.
The drawback of using kernal density estimation histogram method are- it results in discontinuous shape of the histogram. The data representation is poor. The data is represented vaguely and causes disruptions.
You can pull the values directly from the density
function:
x = rnorm(100)
d = density(x, from=-5, to = 5, n = 1000)
d$x
d$y
Alternatively, if you really want to write your own kernel density function, here's some code to get you started:
Set the points z
and x
range:
z = c(-2, -1, 2)
x = seq(-5, 5, 0.01)
Now we'll add the points to a graph
plot(0, 0, xlim=c(-5, 5), ylim=c(-0.02, 0.8),
pch=NA, ylab="", xlab="z")
for(i in 1:length(z)) {
points(z[i], 0, pch="X", col=2)
}
abline(h=0)
Put Normal density's around each point:
## Now we combine the kernels,
x_total = numeric(length(x))
for(i in 1:length(x_total)) {
for(j in 1:length(z)) {
x_total[i] = x_total[i] +
dnorm(x[i], z[j], sd=1)
}
}
and add the curves to the plot:
lines(x, x_total, col=4, lty=2)
Finally, calculate the complete estimate:
## Just as a histogram is the sum of the boxes,
## the kernel density estimate is just the sum of the bumps.
## All that's left to do, is ensure that the estimate has the
## correct area, i.e. in this case we divide by $n=3$:
plot(x, x_total/3,
xlim=c(-5, 5), ylim=c(-0.02, 0.8),
ylab="", xlab="z", type="l")
abline(h=0)
This corresponds to
density(z, adjust=1, bw=1)
The plots above give:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With