I'm always hesitant to claim there are no situations in which something involving R environments and scoping might bite, but ... after some more exploration, my first usage above does look safe.
It turns out that the printed call is a bit of a red herring.
The formula that actually gets used by other functions (and the one extracted by formula() and as.formula()) is the one stored in the terms element of the fit object, and it gets the actual formula right. (The terms element contains an object of class "terms", which is just a "formula" with a bunch of attached attributes.)
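As a minimal sketch of the difference between the printed call and the stored terms, the following uses a hypothetical wrapper named fit_with (a stand-in for illustration, not the fun() from my question) that builds its formula with reformulate():

```r
# Hypothetical wrapper, for illustration only (not the fun() from the question)
fit_with <- function(XX) {
  ff <- reformulate(termlabels = XX, response = "mpg")
  lm(ff, data = mtcars)
}

fit <- fit_with(c("cyl", "disp"))

fit$call$formula   # the unevaluated symbol ff, which is what the printed call shows
class(fit$terms)   # the "terms" object also inherits from "formula"
formula(fit)       # the formula recovered from the terms element
```

The call only records what was typed inside the wrapper, whereas the terms element holds the evaluated formula.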
To see that all of the proposals in my question and the associated comments store the same "formula" object (up to the associated environment), run the following.
## First the three approaches in my post
formula(fun(XX=c("cyl", "disp")))
# mpg ~ cyl + disp
# <environment: 0x026d2b7c>
formula(lm(mpg ~ cyl + disp, data=mtcars))
# mpg ~ cyl + disp
formula(fun2(XX=c("cyl", "disp"))$call)
# mpg ~ cyl + disp
# <environment: 0x02c4ce2c>
## Then Gabor Grothendieck's idea
XX = c("cyl", "disp")
ff <- reformulate(response="mpg", termlabels=XX)
formula(do.call("lm", list(ff, quote(mtcars))))
## mpg ~ cyl + disp
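To check "the same up to the environment" programmatically rather than by eye, one option is to reset the environments before comparing. This is only a sketch: same_formula is a hypothetical helper, not something from base R, and it compares just the direct lm() fit with the reformulate() approach.

```r
# Hypothetical helper: compare two formulas while ignoring their environments
same_formula <- function(f1, f2) {
  environment(f1) <- globalenv()
  environment(f2) <- globalenv()
  identical(f1, f2)
}

# Formula from a fit with the formula typed out directly
f_direct <- formula(lm(mpg ~ cyl + disp, data = mtcars))

# Formula from a fit whose formula was built with reformulate()
XX <- c("cyl", "disp")
ff <- reformulate(response = "mpg", termlabels = XX)
f_built <- formula(do.call("lm", list(ff, quote(mtcars))))

same_formula(f_direct, f_built)
```

Resetting the environment by hand is needed because the environment is stored as an attribute of the formula, so a plain identical() would also compare it.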
To confirm that formula() really is deriving its output from the terms element of the fit object, have a look at stats:::formula.lm and stats:::formula.terms.
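A quick empirical check of the same point, as a sketch: overwrite the formula recorded in the call element (with a made-up symbol, not_the_real_formula) and see that formula() is unaffected.

```r
fit <- lm(mpg ~ cyl + disp, data = mtcars)

# Tamper with the stored call only; the terms element is left untouched
fit$call$formula <- quote(not_the_real_formula)

fit$call       # now shows the made-up symbol
formula(fit)   # still recovered from fit$terms
```

If formula() were reading the call, the last line would fail or return the made-up symbol instead.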