Question

My code in R has the following trivial assignment:

 df$a<-factor(df$a,levels=c("3","2","1")) 

(The order of the levels matters for plotting, so it probably has to be set with an assignment like this.)

How could I achieve the same result using rpy2? Let's say I have a DataFrame constructed like this:

from rpy2 import robjects

d = {'a': robjects.IntVector((1,2,3)), 'b': robjects.IntVector((4,5,6))}
dataf = robjects.DataFrame(d)

Now I would like to change the type of column 'a' and set the order of its levels, as I did in R. Is this possible using rpy2?


Solution

To fix the levels in an R factor:

>>> from rpy2.robjects.vectors import FactorVector, IntVector
>>> v = FactorVector(IntVector((1,2,3)), levels=IntVector((3,2,1)))
>>> print(v)
[1] 1 2 3
Levels: 3 2 1

Changing a column in a DataFrame can be done with:

>>> dataf[dataf.names.index('a')] = v

Note: Your R code passes numeric values (integers) while specifying the levels as strings. R silently allows this, but be aware that an R factor is stored internally as a vector of integer codes, so mixing the two can lead to unpleasant surprises.
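
To make the integer-code point concrete, here is a minimal pure-Python sketch of how a factor is laid out: a list of level labels plus 1-based codes pointing into it. The helper make_factor is hypothetical and written only for illustration. It is not part of rpy2 or R, and it does not handle values missing from the levels (R would turn those into NA).

```python
# Pure-Python sketch of how R stores a factor: the level labels plus
# 1-based integer codes pointing into them. No rpy2 or R is needed.

def make_factor(values, levels):
    """Hypothetical helper: return (codes, labels) the way a factor stores them."""
    labels = [str(level) for level in levels]
    # Values are matched to levels by their string form; codes are 1-based.
    # A value that is not among the levels raises ValueError here.
    codes = [labels.index(str(value)) + 1 for value in values]
    return codes, labels

codes, labels = make_factor([1, 2, 3], ["3", "2", "1"])

# The stored codes are positions in the levels, not the original numbers.
print(codes)      # [3, 2, 1]

# Getting the original numbers back means going through the labels.
recovered = [int(labels[code - 1]) for code in codes]
print(recovered)  # [1, 2, 3]
```

In R terms, this is why as.integer(df$a) returns the codes, while as.integer(as.character(df$a)) gives back the original values.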

Licensed under: CC-BY-SA with attribution