Question

Running the following code gives me a NaN:

library(KernSmooth) 
x <- c(5.84155992364115, 1.55292112974119, 0.0349665318792623, 3.93053647398094,
       3.42790577684633, 2.9715553006801, 0.837108410045353, 2.872476865277, 
       3.89232548092257, 0.206399650539628) 
y <- c(0.141415317472329, 1.34799648955049, 0.0297566221758204, 
       -0.966736679061812, 0.246306732122746, 0.557982376254723, 
       0.740542828791083, 0.162336127802977, -0.428804158514744, 
       0.691280978689863) 

locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]

I get

[1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947  0.4441603
[7]  0.1425592 -0.3600028 -0.7840411 -1.0517612 -1.2690134        NaN

On another computer, I get the same, except I get -0.7270521 instead of NaN. I am guessing that most of you will also get that. So the question is how do I fix my broken system? Does this have to do with my LAPACK or LIBBLAS?

Note that both computers mentioned above use Ubuntu. The one that gave NaN uses Ubuntu 13.10, the one that gave a number is on 12.04.

EDIT:

My new suspicion is that it is a floating point calculation issue: A local polynomial regression is just a weighted linear regression, where the weights decrease the further the point is away from the point of evaluation, in this case 5.84. One should note that the bandwidth is small so a first thought is that there are no points within the bandwidth. However, locpoly uses a Gaussian kernel, so that all points have strictly positive weight. My guess is that the weights are so small though that rounding or floating point calculation can be a problem. I'm not sure how to fix that.

Was it helpful?

Solution 2

If I am using Windows 7 and R 3.0, I get:

 > locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]
 [1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947
 [6]  0.4441603  0.1425592 -0.3600028 -0.7840411 -1.0517612
[11] -1.2690134 -2.8078788

So your issue wasn't there. However if I use R 3.0 on Ubuntu 13.04 (GNU/Linux 3.8.0-23-generic x86_64) I get:

 > locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]

 [1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947  0.4441603
 [7]  0.1425592 -0.3600028 -0.7840411 -1.0517612 -1.2690134        NaN

I tried experimenting and was able to get numbers very similar to what I got in Windows 7 by using:

> locpoly(round(x,3), round(y,3), bandwidth = 0.4821232, gridsize = 12, degree = 1)[['y']]

 [1]  0.3032295  0.6459197  0.9533132  1.1121400  0.8118960  0.4437407
 [7]  0.1422658 -0.3604210 -0.7848982 -1.0531299 -1.2710219 -0.7269588

So I hope that is able to solve your second problem.

In order to figure out why I was able to get non-NaN answer with Windows, but not Ubuntu, we can look at http://cran.r-project.org/web/packages/KernSmooth/index.html and notice that:

MacOS X binary: KernSmooth_2.23-10.tgz Windows binary: KernSmooth_2.23-11.zip

Naturally there are two different versions, but the Windows binary is one version further than MacOS X binary. I checked out the sourcecode for the functions in Ubuntu and Windows and they look to be the same. However, I did find this Rounding differences on Windows vs Unix based system in sprintf showing that there is a reported bug for differences in rounding between unix and windows. Although that was asked 3 years ago. So I would say the difference might be OS or version for KernSmooth (would lean toward OS as others have also encountered that issue)

OTHER TIPS

Not an answer, but wanted to post a graph. I'm still not clear on what you expected to get from locpoly, but here it is.

Rgames> foo<-locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1)
Rgames> foo
$x
 [1] 0.03496653 0.56283866 1.09071078 1.61858291 2.14645504 2.67432716
 [7] 3.20219929 3.73007142 4.25794354 4.78581567 5.31368780 5.84155992

$y
 [1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947  0.4441603
 [7]  0.1425592 -0.3600028 -0.7840411 -1.0517612 -1.2690134        NaN

enter image description here My suspicion is that last point on the far right diverges for the fitting parameters in use, and it was dumb luck that you got a non-NaN value under any OS.

I'm on Windows 7, R 3.0.1.

It does seem to be a floating point issue, but because of max(x): changing the first entry in x(which happens to be it's max) from 5.84155992364115 to 5.841559923 your NaN becomes Inf, and to 5.84155992 your NaNbecomes -0.7261049.

Also setting the option truncate to FALSE changes the ouput considerably:

locpoly(x, y, bandwidth = 0.4821232, gridsize = 12, degree = 1, truncate=F)[['y']]
[1]  0.3030137  0.6456624  0.9530586  1.1121106  0.8120947  0.4441603  0.1425592 -0.3600028 -0.7449278 -0.3872891 -0.1235228  0.1414153

which I wouldn't have anticipated since you didn't specify range.x.

You're asking for a local polynomial of degree 1 (Requires 2 points to fit, minimum) and there is only one point local to 5.84155992364115. The real question is, why didn't it give you a nice error telling you to up the bandwidth. Nudge it up to 0.5 and it all works.

I will like to put it differently,

I am not regular user of ubuntu, but know NaN(Not a Number) which was started by Java!

First I will say update Lapack And make sure all files are installed correctly (Recent Bug)

if some file is missing and the number is not processed well.

Divide by Zero (or Invalid result due to missing library) can cause NAN in result.

I don't think ubuntu has any problem with this as this.

Please specify version of LAPACK from better understanding.(including Ubuntu is 32 or 64 bit and LAPACK is 32 or 64bit)

I hope this will help.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top