Question

I'd like to use pandas in combination with R, so I did:

import pandas as pd
import rpy2.robjects as robjects

>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=["one", "two", "three"])
>>> robjects.r.cor(df.A, df.B)
    ValueError: Nothing can be done for the type <class 'pandas.core.series.Series'> at the moment.

Does this mean I cannot yet use pandas' objects with rpy2?

I then tried:

import pandas.rpy.common as com

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=["one", "two", "three"])
rdf = com.convert_to_r_dataframe(df)

But how would I do the above with rdf? For instance, rdf['A'] gives me a TypeError.


Solution

There is initial support for using pandas seamlessly with R through rpy2.

You'll need to do:

from rpy2.robjects import pandas2ri
pandas2ri.activate()

The documentation is a little behind and the support is not complete, but there is a small example showing where this is heading:

https://plus.google.com/116424798545383828852/posts/jPfZ8VcTVi3
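If the automatic conversion is not available in your rpy2 version, the columns can also be converted by hand before calling R. The following is only a minimal sketch: it assumes rpy2 and a working R installation, and `r_cor` is a hypothetical helper name, not part of rpy2. It relies on `robjects.FloatVector` and on looking up the R function with `robjects.r['cor']`.

```python
def r_cor(x, y):
    """Return R's cor(x, y) for two sequences of numbers.

    Hypothetical helper for illustration; requires rpy2 and R.
    """
    # Imported lazily so this file still loads on machines without R.
    import rpy2.robjects as robjects

    # Convert each sequence explicitly to an R numeric vector.
    rx = robjects.FloatVector([float(v) for v in x])
    ry = robjects.FloatVector([float(v) for v in y])

    # R returns a numeric vector of length 1; take its only element.
    return robjects.r['cor'](rx, ry)[0]
```

With the DataFrame from the question, this would be called as `r_cor(df['A'], df['B'])`, since iterating over a pandas Series yields its values.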

OTHER TIPS

Why don't you use pandas?

import pandas as pd
from scipy import stats

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df.corr('pearson')
=>    A  B
   A  1  1
   B  1  1
stats.f_oneway(df['A'], df['B'])
=> (13.5, 0.021311641128756723)

I know this doesn't answer your question exactly, but sometimes running into issues like this indicates that you are not using the tools as they are intended.
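If all you need is the single value that cor(df.A, df.B) would return in R, the closest pandas counterpart is probably `Series.corr`, which computes the Pearson correlation by default. A short sketch using the same data as above:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Pearson correlation between two columns (the default method),
# returned as a single float rather than a correlation matrix.
r = df['A'].corr(df['B'])
print(r)
```

The two columns here are perfectly linearly related, so the result should be 1 up to floating-point rounding.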

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow