rpy2 object not found error

Question 1

When calling

r('lmout <- lm(r_df$a ~ r_df$b)')

the embedded R will look for a variable r_df, and no such variable is made visible to R in your code example.

When doing

r_df = com.convert_to_r_dataframe(datframe)

you are creating the variable r_df on the Python side. The actual data is now in R, but no symbol (name) known to R is associated with it, so the data structure remains anonymous. (By the way, you may want to use the automagic conversion of pandas data frames that ships with rpy2-2.3.3.)

To create a variable name in R's "global environment", add this:

from rpy2.robjects import globalenv
globalenv['r_df'] = r_df

Now your lm() call should work.

Question 2

Try this (I am not sure which import does the magic, though):

import rpy2.robjects as robjects
from rpy2.robjects import DataFrame, Formula
import rpy2.robjects.numpy2ri as npr
import numpy as np
from rpy2.robjects.packages import importr


def my_linear_fit_using_r(X,Y,verbose=True):
   # ## FITTINGS:   RPy implementation ###
   # Wrap small R functions so they can be called from Python.
   r_correlation = robjects.r('function(x,y) cor.test(x,y)')  # defined but not used below
   # r_quadfit = robjects.r('function(x,y) lm(y~I(x)+I(x^2))')
   r_linfit = robjects.r('function(x,y) lm(y~x)')
   r_get_r2=robjects.r('function(x) summary(x)$r.squared')
   # Fit y = a + b*x. The data are passed as arguments, so R needs no names for them.
   lin=r_linfit(robjects.FloatVector(X),robjects.FloatVector(Y))
   coef_lin=robjects.r.coef(lin)
   a=coef_lin[0]  # intercept
   b=coef_lin[1]  # slope
   r2=r_get_r2(lin)
   ci=robjects.r.confint(lin) # confidence intervals, a 2x2 matrix stored column by column
   lwr_a=ci[0]
   lwr_b=ci[1]
   upr_a=ci[2]
   upr_b=ci[3]
   if verbose:
      # print() with a single argument works in both Python 2 and Python 3
      print(robjects.r.summary(lin))
      # print(robjects.r.summary(quad))
   return (a,b,r2[0],lwr_a,upr_a,lwr_b,upr_b)
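
The flat indexing of ci in the function above relies on R storing matrices column by column. For a simple linear model, confint() returns a 2x2 matrix with one row per coefficient and one column per bound, so the flat index walks down the lower-bound column first. A minimal pure-Python sketch of that layout, using made-up numbers and a hypothetical helper flat_index (not part of rpy2):

```python
# Made-up confidence bounds, laid out the way R prints them:
#               2.5 %   97.5 %
# (Intercept)    0.5     1.5
# x              1.8     2.2
rows = ['(Intercept)', 'x']
cols = ['2.5 %', '97.5 %']
matrix = [[0.5, 1.5],
          [1.8, 2.2]]

# Column-major order: every row of column 0 first, then every row of column 1.
flat = [matrix[i][j] for j in range(len(cols)) for i in range(len(rows))]


def flat_index(row, col, nrow=2):
    """Hypothetical helper: 0-based flat position of element (row, col)."""
    return col * nrow + row


# Same unpacking order as ci[0], ci[1], ci[2], ci[3] in the function above.
lwr_a, lwr_b, upr_a, upr_b = flat
print(flat)
```

This is why ci[1] is the lower bound of the slope and not the upper bound of the intercept.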

Question 3

Just a remark: for simple regressions you can do everything in Python with ols from statsmodels:

from statsmodels.formula.api import ols

lmout = ols('a ~ b', datframe).fit()
print(lmout.summary())
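
For a version that runs on its own, here is a small sketch with made-up data. It assumes pandas and statsmodels are installed; the column names a and b follow the formula above, and the values are purely illustrative. It pulls out the same quantities that the rpy2 function in the previous answer returns.

```python
import pandas as pd
from statsmodels.formula.api import ols

# Made-up data for illustration only.
datframe = pd.DataFrame({
    'b': [1.0, 2.0, 3.0, 4.0, 5.0],
    'a': [3.1, 4.9, 7.2, 8.8, 11.0],
})

# Fit a = intercept + slope * b by ordinary least squares.
lmout = ols('a ~ b', data=datframe).fit()

intercept = lmout.params['Intercept']
slope = lmout.params['b']
r2 = lmout.rsquared

# conf_int() returns a DataFrame: one row per coefficient,
# column 0 for the lower bound and column 1 for the upper bound.
ci = lmout.conf_int()
lwr_b, upr_b = ci.loc['b', 0], ci.loc['b', 1]

print(intercept, slope, r2, lwr_b, upr_b)
```

Here the data never leave Python, so there is no R-side name to manage.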