Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R: locpoly is incorrectly returning NaN

Running the following code gives me a NaN:

library(KernSmooth) 
x <- c(5.84155992364115, 1.55292112974119, 0.0349665318792623, 3.93053647398094,
       3.42790577684633, 2.9715553006801, 0.837108410045353, 2.872476865277, 
       3.89232548092257, 0.206399650539628) 
y <- c(0.141415317472329, 1.34799648955049, 0.0297566221758204, 
       -0.966736679061812, 0.246306732122746, 0.557982376254723, 
       0.740542828791083, 0.162336127802977, -0.428804158514744, 
       0.691280978689863) 

locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]

I get

[1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947  0.4441603
[7]  0.1425592 -0.3600028 -0.7840411 -1.0517612 -1.2690134        NaN

On another computer, I get the same, except I get -0.7270521 instead of NaN. I am guessing that most of you will also get that. So the question is how do I fix my broken system? Does this have to do with my LAPACK or LIBBLAS?

Note that both computers mentioned above use Ubuntu. The one that gave NaN uses Ubuntu 13.10, the one that gave a number is on 12.04.

EDIT:

My new suspicion is that it is a floating point calculation issue: A local polynomial regression is just a weighted linear regression, where the weights decrease the further the point is away from the point of evaluation, in this case 5.84. One should note that the bandwidth is small so a first thought is that there are no points within the bandwidth. However, locpoly uses a Gaussian kernel, so that all points have strictly positive weight. My guess is that the weights are so small though that rounding or floating point calculation can be a problem. I'm not sure how to fix that.

like image 449
Xu Wang Avatar asked Mar 16 '14 05:03

Xu Wang


2 Answers

Not an answer, but wanted to post a graph. I'm still not clear on what you expected to get from locpoly, but here it is.

Rgames> foo<-locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)
Rgames> foo
$x
 [1] 0.03496653 0.56283866 1.09071078 1.61858291 2.14645504 2.67432716
 [7] 3.20219929 3.73007142 4.25794354 4.78581567 5.31368780 5.84155992

$y
 [1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947  0.4441603
 [7]  0.1425592 -0.3600028 -0.7840411 -1.0517612 -1.2690134        NaN

enter image description here My suspicion is that last point on the far right diverges for the fitting parameters in use, and it was dumb luck that you got a non-NaN value under any OS.

like image 157
Carl Witthoft Avatar answered Oct 09 '22 13:10

Carl Witthoft


If I am using Windows 7 and R 3.0, I get:

 > locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]
 [1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947
 [6]  0.4441603  0.1425592 -0.3600028 -0.7840411 -1.0517612
[11] -1.2690134 -2.8078788

So your issue wasn't there. However if I use R 3.0 on Ubuntu 13.04 (GNU/Linux 3.8.0-23-generic x86_64) I get:

 > locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]

 [1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947  0.4441603
 [7]  0.1425592 -0.3600028 -0.7840411 -1.0517612 -1.2690134        NaN

I tried experimenting and was able to get numbers very similar to what I got in Windows 7 by using:

> locpoly(round(x,3), round(y,3), bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]

 [1]  0.3032295  0.6459197  0.9533132  1.1121400  0.8118960  0.4437407
 [7]  0.1422658 -0.3604210 -0.7848982 -1.0531299 -1.2710219 -0.7269588

So I hope that is able to solve your second problem.

In order to figure out why I was able to get non-NaN answer with Windows, but not Ubuntu, we can look at http://cran.r-project.org/web/packages/KernSmooth/index.html and notice that:

MacOS X binary: KernSmooth_2.23-10.tgz Windows binary: KernSmooth_2.23-11.zip

Naturally there are two different versions, but the Windows binary is one version further than MacOS X binary. I checked out the sourcecode for the functions in Ubuntu and Windows and they look to be the same. However, I did find this Rounding differences on Windows vs Unix based system in sprintf showing that there is a reported bug for differences in rounding between unix and windows. Although that was asked 3 years ago. So I would say the difference might be OS or version for KernSmooth (would lean toward OS as others have also encountered that issue)

like image 29
James Tobin Avatar answered Oct 09 '22 11:10

James Tobin