Question

I'm trying to call the R function loess via Rpy2 in Python on this datafile: http://filebin.ca/azuz9Piv0z8/test.data

It works when I use a subset of the data (the first 1000 points) but when I try to use the entire file, I get an error. My code:

import os
import pandas
from rpy2.robjects import r
import rpy2.robjects as robjects
data = pandas.read_table(os.path.expanduser("~/test2.data"), sep="\t").values
small_data = data[0:1000, :]
print "small data loess:"
a, b = robjects.FloatVector(list(small_data[:, 0])), \
       robjects.FloatVector(list(small_data[:, 1]))
df = robjects.DataFrame({"a": a, "b": b})
loess_fit = r.loess("b ~ a", data=df)
print loess_fit

print "large data loess:"
a, b = robjects.FloatVector(list(data[:, 0])), \
       robjects.FloatVector(list(data[:, 1]))
df = robjects.DataFrame({"a": a, "b": b})
loess_fit = r.loess("b ~ a", data=df)
print loess_fit

Fitting on small_data works but not data. I get the error:

Error in simpleLoess(y, x, w, span, degree, parametric, drop.square, normalize,  : 
  NA/NaN/Inf in foreign function call (arg 1)
    loess_fit = r.loess("b ~ a", data=df)
  File "/usr/local/lib/python2.7/dist-packages/rpy2-2.3.3-py2.7-linux-x86_64.egg/rpy2/robjects/functions.py", line 86, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rpy2-2.3.3-py2.7-linux-x86_64.egg/rpy2/robjects/functions.py", line 35, in __call__
    res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in simpleLoess(y, x, w, span, degree, parametric, drop.square, normalize,  : 
  NA/NaN/Inf in foreign function call (arg 1)

How can this be fixed? I'm not sure whether the problem lies in the R function loess or in the Rpy2 interface to it. Thanks.


Solution

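The problem is that your data contains -Inf values, which loess cannot handle. The first 1000 points happen not to include the offending row (row 5952), which is why the subset works: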

DF <- read.table('http://filebin.ca/azuz9Piv0z8/test.data')
DF[!is.finite(DF[,1]) | !is.finite(DF[,2]),]
#        V1   V2
# 5952 -Inf -Inf
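
On the Python side, one way to apply this is to drop the non-finite rows before building the R vectors. The following is only a minimal sketch: the small array is made-up stand-in data (not the contents of test.data), and the names finite_rows and clean are introduced here for illustration.

```python
import numpy as np

# Made-up stand-in for the two-column array loaded from the data file.
# The third row mimics the (-Inf, -Inf) row found above.
data = np.array([
    [0.0, 1.0],
    [1.0, 2.5],
    [-np.inf, -np.inf],
    [2.0, 3.1],
    [3.0, np.nan],
])

# Keep only rows in which every column is finite (drops NaN, Inf and -Inf).
finite_rows = np.isfinite(data).all(axis=1)
clean = data[finite_rows]

print("dropped %d of %d rows" % (len(data) - len(clean), len(data)))
```

The columns clean[:, 0] and clean[:, 1] could then be passed to robjects.FloatVector in place of data[:, 0] and data[:, 1] in the question's code.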

OTHER TIPS

Why call R, when you can use the statsmodels package in Python for lowess smoothing?
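
As a rough sketch of what that might look like: the example below uses synthetic data (a noisy sine curve, not the file from the question) and an arbitrarily chosen frac=0.3. Note that statsmodels' lowess is a locally linear smoother and is not identical to R's loess, so the fitted values should not be expected to match exactly.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# Synthetic example data (assumption: stands in for the two columns of the file).
rng = np.random.RandomState(0)
x = np.linspace(0.0, 10.0, 200)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

# Inject one non-finite row, similar to the -Inf row in the question's data.
x[50] = -np.inf
y[50] = -np.inf

# Filter non-finite values explicitly before smoothing.
mask = np.isfinite(x) & np.isfinite(y)

# lowess takes the response (endog) first, then the predictor (exog).
# By default it returns an array sorted by x: column 0 is x, column 1 the fit.
smoothed = lowess(y[mask], x[mask], frac=0.3)

print(smoothed[:5])
```

The frac argument is the fraction of the data used for each local fit; smaller values follow the data more closely, larger values smooth more.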

There's also a Bio.Statistics package for lowess, but it doesn't appear to be as accurate, and I couldn't get it to converge for this lowess example.

Licensed under: CC-BY-SA with attribution