Question

I have an lm object and want to extract its formula with the fitted coefficients substituted in. The model includes categorical variables such as month, as well as interactions between these categorical variables and numeric ones.

Another user helped with some code that works for numeric predictors. However, when I add a categorical variable (e.g. d here), it breaks down with the error "Error in parse(text = x) : <text>:1:785: unexpected numeric constant":

a = c(1, 2, 5, 13, 40, 29, 82, 22, 34, 54, 12, 31, 21, 29, 31, 42)
b = c(12, 15, 20, 12, 34, 56, 12, 12, 15, 20, 12, 34, 56, 12, 32, 41)
c = c(20, 30, 40, 18, 72, 34, 12, 40, 18, 72, 28, 65, 21, 32, 42, 52)
d = structure(c(8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 
8L, 1L, 9L, 7L), .Label = c("April", "August", "December", 
"February", "January", "July", "June", "March", "May", "November", 
"October", "September"), class = "factor")


model = lm(a~b+c+factor(d))

as.formula(
  paste0("y ~ ", round(coefficients(model)[1],2), " + ", 
    paste(sprintf("%.2f * %s", 
                  coefficients(model)[-1],  
                  names(coefficients(model)[-1])), 
          collapse=" + ")
  )
)

What I get from the above is:

Error in parse(text = x) : <text>:1:53: unexpected symbol
1: y ~ -7 + 14.23 * b + -6.82 * c + -529.30 * factor(d)August

What I'd like is to get the full formula, with each month multiplied by its coefficient (in this example only 3 of them; in my actual dataset I have much more data, and every month occurs at least 8 times). But it stalls here: in this example with 'unexpected symbol', and with my actual data with "Error in parse(text = x) : <text>:1:785: unexpected numeric constant", without even getting as far as a month term as it does here (I'm not sure why the example and my actual code differ).

My formulas are quite large, so the solution needs to scale up (which the current code does).

Solution

What you are creating is not a valid formula in R: a term such as factor(d)August does not parse as R code. So don't try to coerce the result of sprintf into a formula; keep it as a character string.

Something like this builds the string:

sprintf('y ~ %.2f + %s', coef(model)[1],
        paste(sprintf('(%.2f) * %s',
                      coef(model)[-1], names(coef(model)[-1])),
              collapse = ' + '))
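
As a rough illustration of the difference, here is a minimal sketch that uses the built-in mtcars data as a stand-in for the asker's data (fit, cf, eq and eq_bt are names made up for this example). It builds the plain string first, then a variant that wraps every coefficient name in backticks. Backticks are an assumption on my part about what would be acceptable here: they turn names like factor(cyl)6 into valid symbols, so as.formula() can parse the string, but the result is only useful for display, not for refitting.

```r
# Stand-in model on built-in data (not the asker's data);
# factor(cyl) plays the role of the month factor.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
cf <- coef(fit)

# Plain-text equation: kept as a character string, never parsed.
eq <- sprintf("y ~ %.2f + %s", cf[1],
              paste(sprintf("(%.2f) * %s", cf[-1], names(cf)[-1]),
                    collapse = " + "))
cat(eq, "\n")

# Parseable variant: backticks make each coefficient name a single symbol.
eq_bt <- sprintf("y ~ %.2f + %s", cf[1],
                 paste(sprintf("(%.2f) * `%s`", cf[-1], names(cf)[-1]),
                       collapse = " + "))
f <- as.formula(eq_bt)
class(f)
```

Calling as.formula(eq) on the first string fails in the same way as in the question, because a name such as factor(cyl)6 is a function call followed directly by a number.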

OTHER TIPS

In your model you have 5 explanatory variables and only 3 data points. See summary(model).
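
One way to check for this, sketched with made-up toy data (toy, y, x1, x2 and x3 are hypothetical, not the asker's variables): compare the number of observations with the number of coefficients, and look for coefficients that come back as NA. An NA coefficient would otherwise show up in the pasted equation as the literal text "NA".

```r
# Hypothetical toy data: 3 observations, 4 coefficients to estimate.
toy <- data.frame(y  = c(1, 2, 4),
                  x1 = c(1, 2, 3),
                  x2 = c(2, 1, 5),
                  x3 = c(0, 1, 1))
fit <- lm(y ~ x1 + x2 + x3, data = toy)

nobs(fit)           # number of observations used in the fit
length(coef(fit))   # number of coefficients requested
df.residual(fit)    # residual degrees of freedom
coef(fit)           # inestimable coefficients are reported as NA

# Drop NA coefficients before pasting them into an equation string.
cf_ok <- coef(fit)[!is.na(coef(fit))]
names(cf_ok)
```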

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow