Question

I'm trying to use Python to define a variable as a string, specifically a path to a file. I then want Python to pass that string to an R variable, and then use R's read.table function to read the contents of that file into an R variable as a table. I'm using rpy2 and r.assign to do this, but I'm getting nowhere. Any help would be appreciated! The error message I receive is pasted below the code.

import os
import sys
import rpy2.robjects as robjects

r = robjects.r

known_genes = str(raw_input('Path to file containing gene coordinates? '))
anno_genes = str(raw_input('Path to gene:ilmn ID mapping file? '))
ms_meta = str(raw_input('Path to GWAS MS Meta Data file? '))
SNP_ID = str(raw_input('SNP Identifier? '))
SNP_dir = str(raw_input('SNP results directory? '))


r.assign('known.genes', known_genes)
r.assign('anno.genes', anno_genes)
r.assign('ms.meta', ms_meta)
r.assign('SNP', SNP_ID)
r.assign('SNP_dir', SNP_dir)

knowngenes = r('read.table("known.genes", header=T, as.is=T)')
annogenes = r('read.table("anno.genes", header=T, as.is=T)')



Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'known.genes': No such file or directory
Traceback (most recent call last):
  File "plot.py", line 24, in <module>
    knowngenes = r('read.table("known.genes", header=T, as.is=T)')
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/rpy2-2.3.8-py2.7-macosx-10.6-intel.egg/rpy2/robjects/__init__.py", line 240, in __call__
    res = self.eval(p)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/rpy2-2.3.8-py2.7-macosx-10.6-intel.egg/rpy2/robjects/functions.py", line 86, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/rpy2-2.3.8-py2.7-macosx-10.6-intel.egg/rpy2/robjects/functions.py", line 35, in __call__
    res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in file(file, "rt") : cannot open the connection

Solution

knowngenes = r('read.table("known.genes", header=T, as.is=T)')

should simply be

knowngenes = r('read.table(known.genes, header=T, as.is=T)')

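Python passes the code string to R unchanged, and R treats anything inside "" as a string literal, not as a variable name. As a result, read.table tried to open a file literally named "known.genes" instead of the path stored in the R variable known.genes. Without the quotes, R looks up the variable and uses the path that r.assign stored in it.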
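As an alternative to referring to the R variable by name, the path can be embedded in the R code as a quoted string literal. The following is only a minimal sketch: the helpers r_string_literal and read_table_call and the example path are hypothetical names introduced for illustration, not part of rpy2. It builds the R source text in plain Python, so it can be inspected before being handed to r(...).

```python
def r_string_literal(value):
    """Return value as a double-quoted R string literal (hypothetical helper)."""
    # Escape backslashes first, then double quotes, as R string syntax requires.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def read_table_call(path):
    """Build R source text that reads the file at path with read.table."""
    return "read.table({0}, header=TRUE, as.is=TRUE)".format(r_string_literal(path))


# Hypothetical path, for illustration only.
call = read_table_call("data/known_genes.txt")
print(call)  # read.table("data/known_genes.txt", header=TRUE, as.is=TRUE)
```

Here the quotes are correct, because what sits between them is the path itself rather than the name of a variable. The resulting string could then be passed to r(call), assuming rpy2 is set up as in the question.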

OTHER TIPS

The RRuntimeError exception indicates that an error occurred while running R; here the message says that R could not open a connection (that is, the file).
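One way to rule out a genuinely missing file is to check the paths in Python before calling R. This is a rough sketch using only the standard library; missing_paths is a hypothetical helper name, and the demo uses a temporary file plus a made-up path that should not exist.

```python
import os
import tempfile


def missing_paths(paths):
    """Return the sorted names whose path is not an existing file."""
    return sorted(name for name, path in paths.items() if not os.path.isfile(path))


# Demo: one real temporary file and one made-up path that should not exist.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    real_path = handle.name
bogus_path = real_path + ".missing"

result = missing_paths({"known.genes": real_path, "anno.genes": bogus_path})
os.remove(real_path)  # clean up the temporary file
print(result)
```

If this check passes and R still reports "No such file or directory", the file name R received is probably not the one you intended, which points at the R code string rather than at the file system.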

There is probably some confusion between variable names and the contents of variables. Writing

knowngenes = r('read.table("known.genes", header=T, as.is=T)')

is strictly equivalent to writing the following in R:

knowngenes = read.table("known.genes", header=T, as.is=T)

whereas your earlier code stores the name of the file in an R variable called known.genes, which has to be referenced without quotes.

I'd suggest rewriting the code like this, which also minimizes the number of objects you store in the R global environment:

from rpy2.robjects.packages import importr
utils = importr('utils')

# myfilename is a Python string holding the path, e.g. known_genes from the question
mydataframe = utils.read_table(myfilename, header=True, as_is=True)
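
The names read_table and as_is appear because rpy2 by default replaces the dots in R function and argument names with underscores, so that they are valid Python identifiers. The snippet below only mimics that renaming for illustration; r_name_to_python is a hypothetical helper, not rpy2's own code.

```python
def r_name_to_python(r_name):
    """Mimic the default dot-to-underscore renaming (illustrative only)."""
    return r_name.replace(".", "_")


for r_name in ("read.table", "as.is"):
    print("{0} -> {1}".format(r_name, r_name_to_python(r_name)))
```

Similarly, the Python booleans True and False are converted to R's TRUE and FALSE when passed as arguments.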
Licensed under: CC-BY-SA with attribution