Local linear regression in R -- locfit() vs locpoly()

Tags: r, smoothing, regression

I am trying to understand the different behaviors of these two smoothing functions when given apparently equivalent inputs. My understanding was that locpoly just takes a fixed bandwidth argument, while locfit can also include a varying part in its smoothing parameter (a nearest-neighbors fraction, "nn"). I thought setting this varying part to zero in locfit should make the "h" component act like the fixed bandwidth used in locpoly, but this is evidently not the case.

A working example:

    library(KernSmooth)
    library(locfit)
    set.seed(314)

    n <- 100
    x <- runif(n, 0, 1)
    eps <- rnorm(n, 0, 1)
    y <- sin(2 * pi * x) + eps

    plot(x, y)
    lines(locpoly(x, y, bandwidth=0.05, degree=1), col=3)
    lines(locfit(y ~ lp(x, nn=0, h=0.05, deg=1)), col=4)

Produces this plot:

[Image: plot of smoothers]

locpoly gives the smooth green line, and locfit gives the wiggly blue line. Clearly, locfit has a smaller "effective" bandwidth here, even though the supposed bandwidth parameter has the same value for each.

What are these functions doing differently?

asked Feb 02 '15 16:02

user1870614

1 Answer

The two parameters both represent smoothing, but they do so in two different ways.

locpoly's bandwidth parameter is expressed in the units of the x-axis. For example, if you change the line x <- runif(n, 0, 1) to x <- runif(n, 0, 10), the green locpoly line becomes much more squiggly, even though you still have the same number of points (100).
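The scale dependence of locpoly's bandwidth can be checked directly. The sketch below is only illustrative: instead of redrawing x, it stretches the question's x values by a factor of 10 (x10 is a hypothetical variable name) and multiplies the bandwidth by the same factor. If the bandwidth is measured in x-axis units, the two fits should then agree up to floating-point error.

```r
library(KernSmooth)

set.seed(314)
n <- 100
x <- runif(n, 0, 1)
y <- sin(2 * pi * x) + rnorm(n, 0, 1)

# Same data with the x-axis stretched by a factor of 10
# (hypothetical rescaling, not part of the original question).
x10 <- 10 * x

# Original fit, and a fit on the stretched axis with the bandwidth
# stretched by the same factor.
fit_orig   <- locpoly(x,   y, bandwidth = 0.05, degree = 1)
fit_scaled <- locpoly(x10, y, bandwidth = 0.5,  degree = 1)

# Compare the fitted values at corresponding grid points.
same_fit <- isTRUE(all.equal(fit_orig$y, fit_scaled$y, tolerance = 1e-6))
print(same_fit)

plot(x10, y)
lines(fit_scaled, col = 3)
```

Keeping bandwidth = 0.05 on the stretched axis would instead correspond to a window ten times narrower relative to the data.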

locfit's smoothing parameter, h, is independent of the scale, and instead is based on a proportion of the data. The value 0.05 means 5% of the data that is closest to that position is used to fit the curve. So changing the scale would not alter the line.

This also explains the observation made in the comment that changing the value of h to 0.1 makes the two look nearly identical. This makes sense, because we can expect that a bandwidth of 0.05 will contain about 10% of the data if we have 100 points distributed uniformly from 0 to 1.
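The 10% figure can be sanity-checked with a few lines of base R. This is a rough sketch under an assumption: it reads "a bandwidth of 0.05" as a hard window reaching 0.05 to either side of the evaluation point, which is a simplification, since a smooth kernel down-weights distant points rather than cutting them off. The helper frac_within is a made-up name for illustration.

```r
# Fraction of the sample falling within a window of half-width h around x0.
# (frac_within is a hypothetical helper, not part of KernSmooth or locfit.)
frac_within <- function(x, x0, h) mean(abs(x - x0) <= h)

set.seed(314)
x <- runif(100, 0, 1)

# Interior evaluation points, so the window [x0 - h, x0 + h] stays inside [0, 1].
x0 <- seq(0.05, 0.95, by = 0.05)
fracs <- sapply(x0, frac_within, x = x, h = 0.05)

# Average share of the data inside the window; for uniform x on [0, 1]
# the expected share is the window width, 2 * 0.05 = 0.1.
print(mean(fracs))
```

With only 100 points the share at any single x0 will fluctuate around 0.1, so the average over several positions is the more stable number to look at.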

My sources include the documentation for the locfit package and the documentation for the locpoly function.

answered Sep 27 '22 22:09

znr
