Selecting a specific row from an rpy2 DataFrame

https://stackoverflow.com/questions/4355783

rpy2

08-10-2019

Question

My data frame holds survey data loaded from a .csv file. One of the columns is age, and I want to remove all respondents under 18 years of age. I'll then need to split the age groups (18-24, 25-35, etc.) into their own data frames so that I can compute frequency distributions for each.

The R code is simple enough:

x.sub <- subset(x.df, y > 2)

But I can't figure out how to use the r() function to get my data frame variable from Python into an R statement. It feels as though there ought to be a .subset() method on the rpy2 DataFrame class, but if it exists, I can't find it.

Solution

Using rpy2 2.2.0-dev (this should be the same with 2.1.x):

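from rpy2.robjects.vectors import DataFrame

# Read the CSV file into an R data.frame
dataf = DataFrame.from_csvfile("my/file.csv")

# .rx2("age") extracts the column (like R's [[ ]]), .ro applies R's >= element-wise,
# and .rx(rows, True) keeps the matching rows and all columns
dataf_subset = dataf.rx(dataf.rx2("age").ro >= 18, True)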

That exact example is not in the documentation (and maybe it should be), but its constituent elements are: extracting elements (.rx and .rx2) and R operators on vectors (.ro).
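
The question also asks for one data frame per age group. Below is a minimal sketch of how the same .rx, .rx2 and .ro pieces might be combined for that. The helpers age_bands and split_by_age are hypothetical names, not part of rpy2, and the sketch assumes rpy2's default conversion, so that the result of a .ro comparison is itself an rpy2 vector with a .ro attribute. The band edges follow the 18-24, 25-35 grouping from the question.

```python
def age_bands(edges):
    """Turn sorted lower bounds such as [18, 25, 36] into inclusive
    (low, high) pairs; the last band is open-ended (high is None)."""
    bands = []
    for i, low in enumerate(edges):
        high = edges[i + 1] - 1 if i + 1 < len(edges) else None
        bands.append((low, high))
    return bands


def split_by_age(dataf, bands, column="age"):
    """Return {label: sub-data-frame} for an rpy2 DataFrame (hypothetical helper)."""
    age = dataf.rx2(column)               # extract the column as an R vector
    groups = {}
    for low, high in bands:
        mask = age.ro >= low              # element-wise R comparison
        if high is not None:
            # R's element-wise & on two logical vectors
            mask = mask.ro & (age.ro <= high)
        label = f"{low}+" if high is None else f"{low}-{high}"
        groups[label] = dataf.rx(mask, True)   # matching rows, all columns
    return groups


# Usage (requires rpy2 and a working R installation):
#   from rpy2.robjects.vectors import DataFrame
#   dataf = DataFrame.from_csvfile("my/file.csv")
#   groups = split_by_age(dataf, age_bands([18, 25, 36]))
#   adults_18_24 = groups["18-24"]
```

Because the lowest band starts at 18, respondents under 18 fall into no group, so the filtering step and the grouping step happen in one pass. Rows with a missing age would need separate handling, since R comparisons against NA yield NA.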

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow