Since I'm not the author of predict.gls
I can't answer precisely why it was written this way, but I'm hesitant to suggest that this is a bug in a function/package that have been around this long.
Anyway...the reason it works with lm
is that when predict.lm
calls model.frame
:
m <- model.frame(Terms, newdata, na.action = na.action,
xlev = object$xlevels)
it is using the xlev
argument and the information on the levels from the fitted model object itself. I don't believe that gls
object stores this information at all.
In predict.gls
there are two things going on. First, the initial call to model.frame
:
mfArgs <- list(formula = form, data = newdata, na.action = na.action)
mfArgs$drop.unused.levels <- TRUE
dataMod <- do.call("model.frame", mfArgs)
Note that here we're not using the xlev
argument, and in fact we're explicitly saying to drop unused levels. If you create your own version of predict.gls
and comment our the drop.unused.levels
line, it should work, as long as you don't call factor
in your formula!
Why?
Because later on we see this:
X <- model.matrix(form, dataMod)
where your formula is being re-applied. Which means that factor
is being called on columns with levels that don't exist, forcing them to be dropped.
So when I use a version of predict.gls
that comments out the drop.unused.levels
line, and I set cyl
to be a factor up front in the data frame, it generates predictions for me just fine.
But I would suggest asking the package authors as to whether this is intended behavior or not. It seems odd to me, but like I said, for a package this old and well used it seems strange that something like this would be un-intended.