I have data of the form:
x y
0 0
0.01 1
0.03 0
0.04 1
0.04 0
x is continuous from 0 to 1 and not equally spaced, and y is binary.
I'd like to smooth y over the x-axis using R, but can't find the right package. The kernel smoothing functions I've found either produce density estimates of x or give the wrong estimate at the ends of the x range, because they average over regions less than 0 and greater than 1.
I'd also like to avoid linear smoothers like Loess, given the binary form of y. The moving average functions I've seen assume equally spaced x-values.
Do you know of any R functions that will smooth and ideally have a bandwidth selection procedure? I can write a moving average function and cross-validate to determine the bandwidth, but I'd prefer to find an existing function that's been vetted.
I would suggest using something like
d <- data.frame(x,y) ## not absolutely necessary but good practice
library(mgcv)
m1 <- gam(y~s(x),family="binomial",data=d)
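This will (1) respect the binary nature of the data and (2) do automatic degree-of-smoothness ("bandwidth" in your terminology) selection. With mgcv's default method="GCV.Cp", the criterion used for a binomial fit is UBRE, a close relative of generalized cross-validation that applies when the scale parameter is known.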
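To see the smoothness selection at work, here is a minimal sketch on simulated data. The data-generating curve, sample size, seed and the names d_sim, m_default and m_reml are made up for illustration; method="REML" is an alternative criterion that mgcv offers, not something the fit above requires.

```r
library(mgcv)

set.seed(1)                                # hypothetical simulated data
n <- 500
x <- sort(rbeta(n, 2, 2))                  # unequally spaced values in (0, 1)
p_true <- plogis(4 * sin(2 * pi * x))      # assumed "true" probability curve
d_sim <- data.frame(x = x, y = rbinom(n, 1, p_true))

# Same model as above, fitted with the default criterion and with REML
m_default <- gam(y ~ s(x), family = binomial, data = d_sim)
m_reml    <- gam(y ~ s(x), family = binomial, data = d_sim, method = "REML")

# Total effective degrees of freedom selected under each criterion
edf_default <- sum(m_default$edf)
edf_reml    <- sum(m_reml$edf)
print(c(default = edf_default, REML = edf_reml))

# Predictions on the response scale are probabilities, including at x = 0 and x = 1
p_ends <- as.numeric(predict(m_default, newdata = data.frame(x = c(0, 1)),
                             type = "response"))
print(p_ends)
```

Because the smooth is fitted on the logit scale and back-transformed, the estimates cannot leave [0, 1], which addresses the boundary concern in the question.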
Use
plot(y~x, data=d)
pp <- data.frame(x=seq(0,1,length=101))
pp$y <- predict(m1,newdata=pp,type="response")
with(pp,lines(x,y))
or
library(ggplot2)
ggplot(d,aes(x,y))+geom_smooth(method="gam",formula=y~s(x),method.args=list(family=binomial)) ## current ggplot2 passes family via method.args
to get predictions/plot the results.
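(I hope your real data set has more than 5 observations ... otherwise this will fail: s(x) uses a basis dimension of k=10 by default, so the fit needs at least that many distinct x values.)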
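If you also want a pointwise uncertainty band in the base-graphics version, one common approach is to predict on the link scale with se.fit=TRUE and back-transform. The following is a sketch on simulated data; the data, the names d_sim and m_sim, and the choice of a plus/minus 2 standard error band are assumptions for illustration, and the band is only approximate.

```r
library(mgcv)

set.seed(2)                                 # hypothetical simulated data
d_sim <- data.frame(x = runif(300))
d_sim$y <- rbinom(300, 1, plogis(-2 + 4 * d_sim$x))
m_sim <- gam(y ~ s(x), family = binomial, data = d_sim)

# Predict on the logit (link) scale, then map back to probabilities
pp <- data.frame(x = seq(0, 1, length.out = 101))
pr <- predict(m_sim, newdata = pp, type = "link", se.fit = TRUE)
pp$fit <- plogis(as.numeric(pr$fit))
pp$lwr <- plogis(as.numeric(pr$fit - 2 * pr$se.fit))
pp$upr <- plogis(as.numeric(pr$fit + 2 * pr$se.fit))

plot(y ~ x, data = d_sim)
lines(fit ~ x, data = pp)                   # smoothed probability
lines(lwr ~ x, data = pp, lty = 2)          # approximate lower band
lines(upr ~ x, data = pp, lty = 2)          # approximate upper band
```

Building the band on the link scale keeps both limits inside [0, 1], which would not be guaranteed if the standard errors were added directly on the response scale.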