I have an lm object and want to extract its fitted formula with the coefficients filled in. The model includes categorical variables such as month, as well as interactions between these categorical variables and numeric ones.

Another user helped with some code that works for everything except categorical variables. As soon as I add a categorical variable (e.g. d here), it breaks down with the error "Error in parse(text = x) : <text>:1:785: unexpected numeric constant":

a = c(1, 2, 5, 13, 40, 29, 82, 22, 34, 54, 12, 31, 21, 29, 31, 42)
b = c(12, 15, 20, 12, 34, 56, 12, 12, 15, 20, 12, 34, 56, 12, 32, 41)
c = c(20, 30, 40, 18, 72, 34, 12, 40, 18, 72, 28, 65, 21, 32, 42, 52)
d = structure(c(8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 
8L, 1L, 9L, 7L), .Label = c("April", "August", "December", 
"February", "January", "July", "June", "March", "May", "November", 
"October", "September"), class = "factor")


model = lm(a~b+c+factor(d))

as.formula(
  paste0("y ~ ", round(coefficients(model)[1],2), " + ", 
    paste(sprintf("%.2f * %s", 
                  coefficients(model)[-1],  
                  names(coefficients(model)[-1])), 
          collapse=" + ")
  )
)

What I get from the above is:

Error in parse(text = x) : <text>:1:53: unexpected symbol
1: y ~ -7 + 14.23 * b + -6.82 * c + -529.30 * factor(d)August

What I'd like is the full formula, with each of the months multiplied by its coefficient (in this case only 3 of them; in my actual dataset I have much more data and every month occurs at least 8 times). Instead it stops here: in this example with "unexpected symbol", and in my actual data with "Error in parse(text = x) : <text>:1:785: unexpected numeric constant", without even attempting a month term as it does here. I'm not sure why the example and my actual code behave differently.

My formulas are quite large, so it needs to be able to scale up (which the current code does).


Solution

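What you are creating is not a valid formula in R, so don't try to coerce the result of sprintf into a formula. Coefficient names such as factor(d)August are not syntactically valid R expressions: the parser reads factor(d) and then hits an unexpected symbol (August). When the factor levels are numbers, the same problem shows up as "unexpected numeric constant".

Keep the result as a character string instead, with something like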

# Build the equation as a plain string; parentheses keep negative coefficients readable
sprintf('y ~ %.2f + %s', coef(model)[1],
   paste(sprintf('(%.2f) * %s',
          coef(model)[-1], names(coef(model)[-1])), collapse = ' + '))
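If a real formula object is still wanted rather than a string, one possible workaround is to wrap each coefficient name in backticks so that the parser treats it as a single symbol. The following is only a sketch on made-up toy data (dat, x, month and fit are hypothetical names, not from the question), and it also drops any NA coefficients, which is an assumption about what you would want for aliased terms:

```r
# Toy data (hypothetical): one numeric predictor, one factor, and their interaction
set.seed(1)
dat <- data.frame(
  x     = rnorm(24),
  month = factor(rep(c("Jan", "Feb", "Mar"), times = 8))
)
dat$y <- 2 + 0.5 * dat$x + rnorm(24)

fit <- lm(y ~ x * month, data = dat)

cf <- coef(fit)
cf <- cf[!is.na(cf)]   # drop aliased (NA) coefficients, if there are any

# Backticks turn names such as "monthJan" or "x:monthJan" into single symbols
rhs <- sprintf("(%.2f) * `%s`", cf[-1], names(cf)[-1])
eq  <- sprintf("y ~ %.2f + %s", cf[1], paste(rhs, collapse = " + "))

eq                    # character string, suitable for printing
f <- as.formula(eq)   # parses, because every coefficient name is quoted
```

The string eq is usually all that is needed for reporting. The formula f is syntactically valid, but its terms refer to coefficient names rather than to columns of the data, so it is not meant to be passed back to lm.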

Other tips

In your model you have 5 explanatory variables and only 3 data points. See summary(model).
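A quick way to check for this situation, sketched here on a hypothetical three-row data set (small and fit0 are illustrative names), is to look at the residual degrees of freedom and at which coefficients come back as NA:

```r
# Hypothetical tiny data set: 3 observations, but 5 coefficients requested
small <- data.frame(
  a = c(1, 2, 5),
  b = c(12, 15, 20),
  c = c(20, 30, 40),
  d = factor(c("March", "April", "May"))
)

fit0 <- lm(a ~ b + c + d, data = small)

df.residual(fit0)                      # residual degrees of freedom
names(coef(fit0))[is.na(coef(fit0))]   # coefficients that could not be estimated
```

Zero residual degrees of freedom, or NA entries among the coefficients, mean the data cannot support all the requested terms, and any equation built from those coefficients should be read with that in mind.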

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow