Invalid Characters causing error in rlm()

Question

Your issue boils down to the fact that you are using non-syntactic variable names.

These should be used with caution, and without expectation that package authors will be able to anticipate any issues that may arise.

To quote from the help for formula:

Variable names can be quoted by backticks `like this` in formulae, although there is no guarantee that all code using formulae will accept such non-syntactic names.

The issue is in how xvars is created in rlm.formula:

xvars <- as.character(attr(mt, "variables"))[-1L]

and then in how it is used later on:

xlev <- if (length(xvars) > 0L) {
        xlev <- lapply(mf[xvars], levels)
        xlev[!sapply(xlev, is.null)]
    }

which, as you show, does not work.

The as.character() call creates backtick-quoted strings for non-syntactic names. If the names are already backticked, it creates double-backticked names.

i.e. if the column name was "x1^2", the element in xvars becomes "`x1^2`".
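A minimal sketch of that mismatch, using only base R. The data frame d and its values are made up for illustration; the xvars line is the one quoted from rlm.formula above.

```r
# Hypothetical data with a non-syntactic column name "x1^2"
d <- data.frame(y = c(1, 2, 3, 4),
                `x1^2` = c(1, 4, 9, 16),
                check.names = FALSE)

# Build the model frame and terms object for a backticked formula
mf <- model.frame(y ~ `x1^2`, data = d)
mt <- attr(mf, "terms")

# Same construction as in rlm.formula: deparse each variable to a string
xvars <- as.character(attr(mt, "variables"))[-1L]

print(xvars)       # the deparsed names, backticks included
print(names(mf))   # the actual column names of the model frame

# Any FALSE here marks a name that mf[xvars] would fail to find
print(xvars %in% names(mf))
```

The last line shows which entries of xvars have no matching column, which is why the mf[xvars] step raises "undefined columns selected".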

This fails with [.data.frame, for example:

x <- data.frame(`a` = 1)
x[, '`a`']

Error in `[.data.frame`(x, , "`a`") : undefined columns selected

Because the column name is "a", not "`a`".

If you backtick the column name itself, i.e. if the column name was "`x1^2`", the element in xvars becomes "``x1^2``", which again is not a column in your data.frame.

The reason lm works is that it does not define or use xvars in this way. Instead, it uses model.matrix to build the design matrix x directly and passes that to lm.fit.
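As a rough check of the lm side, the sketch below fits the same kind of backticked formula with lm and inspects the design matrix. The data frame d is hypothetical and only meant to show that this path does not trip over the non-syntactic name.

```r
# Hypothetical data with a non-syntactic column name "x1^2"
d <- data.frame(y = c(1.1, 1.9, 3.2, 3.8, 5.1),
                `x1^2` = c(1, 4, 9, 16, 25),
                check.names = FALSE)

# lm accepts the backticked name in the formula
fit <- lm(y ~ `x1^2`, data = d)
print(coef(fit))

# The design matrix is built by model.matrix, one column per term
X <- model.matrix(y ~ `x1^2`, data = d)
print(dim(X))
```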

If you want to fit the model y ~ x1 + x2 + x1:x2 + x1^2 + x2^2, then you can use

rlm(y ~ x1*x2 + I(x1^2) + I(x2^2))

In this case you only need three columns in your data.frame (or objects in your evaluation environment): y, x1 and x2. The I() function lets you perform arithmetic operations on a variable inside a formula, because I is parsed as a symbol by terms.formula and the operators inside it keep their arithmetic meaning.
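A small sketch of the I() approach on simulated data. The data frame d, the seed and the coefficients are assumptions made up for illustration; the rlm call is guarded because it needs the MASS package to be installed.

```r
set.seed(42)

# Simulated data: only y, x1 and x2 are stored as columns
d <- data.frame(x1 = rnorm(30), x2 = rnorm(30))
d$y <- 1 + 2 * d$x1 - d$x2 + 0.5 * d$x1^2 + rnorm(30, sd = 0.3)

f <- y ~ x1 * x2 + I(x1^2) + I(x2^2)

# The squared terms are computed on the fly; no extra columns are needed
X <- model.matrix(f, data = d)
print(colnames(X))

# Fit the robust model if MASS is available
if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::rlm(f, data = d)
  print(coef(fit))
}
```

Because every variable in the formula is either a syntactic name or an I() call, no backticks are needed anywhere.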