R: Equivalent ways of coding a formula of a lm with higher-order terms

https://stackoverflow.com/questions/18361504

26-06-2022

Question

To my knowledge, there are three ways to code second-order (and higher-order) terms in a formula.

We can use the function I(..), use the function poly(..), or construct the second-degree variable ourselves. My question is: how do these functions work?

set.seed(23)
A = rnorm(12)
B = 1:12
C = factor(rep(c(1,2,3),4))
B2=B^2

What is the equivalent of lm(A~poly(B,2)*C) when using I(..) or when using the variable B2?

Using raw=T in the poly(..) function does not change the results, correct?

Solution

lm(A~B2*C)

lm(A~I(B^2)*C)

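both square column B and then fit the regression, so they are the same model. Neither includes the linear term B. Using

poly(B,2)

does something completely different - see ?poly.

Edit to add: by default, poly() calculates orthogonal polynomials, which are not the same as the standard polynomials obtained by simply squaring, cubing, etc. a number. It also returns every degree up to the one requested, so poly(B,2) contributes both a first-degree and a second-degree term.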
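
To make the comparison concrete, here is a minimal sketch that reuses the variables from the question. The object names (fit_orth, fit_raw, fit_I, fit_sq) are made up for illustration. It suggests that the I(..) equivalent of poly(B,2)*C is (B + I(B^2))*C, and that raw=TRUE changes the coefficients but not the fitted values.

```r
set.seed(23)
A <- rnorm(12)
B <- 1:12
C <- factor(rep(c(1, 2, 3), 4))

fit_orth <- lm(A ~ poly(B, 2) * C)              # orthogonal basis (the default)
fit_raw  <- lm(A ~ poly(B, 2, raw = TRUE) * C)  # raw basis: B and B^2
fit_I    <- lm(A ~ (B + I(B^2)) * C)            # same terms written with I()
fit_sq   <- lm(A ~ I(B^2) * C)                  # squared term only, no linear B

# The three full quadratic models produce the same fitted values
all.equal(fitted(fit_orth), fitted(fit_raw))
all.equal(fitted(fit_raw), fitted(fit_I))

# The coefficients differ between the orthogonal and the raw basis
cbind(orthogonal = unname(coef(fit_orth)), raw = unname(coef(fit_raw)))

# The squared-only model is smaller: it has no B or B:C terms
length(coef(fit_sq))
length(coef(fit_orth))
```

So lm(A~B2*C) and lm(A~I(B^2)*C) match each other, but not lm(A~poly(B,2)*C), because they leave out the linear term and its interaction with C.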

OTHER TIPS

Does it mean that poly(B,2,raw=T) is equivalent to I(B^2) or to B+I(B^2)?

Try:

x = 0:99
df = data.frame(x=x,y=rnorm(100)+0.1*x + 0.04*x*x)
lm(y~poly(x,2),data=df)           # orthogonal polynomials (default)
lm(y~poly(x,2,raw=TRUE),data=df)  # raw polynomials: x and x^2
lm(y~x+I(x^2),data=df)            # same coefficients as raw=TRUE
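
Rather than comparing the printed output by eye, the three fits can be checked programmatically. This is a sketch only: the seed and the object names are arbitrary additions that are not part of the original snippet.

```r
set.seed(1)  # arbitrary seed (an assumption), added so the run is reproducible
x  <- 0:99
df <- data.frame(x = x, y = rnorm(100) + 0.1 * x + 0.04 * x^2)

fit_orth <- lm(y ~ poly(x, 2), data = df)
fit_raw  <- lm(y ~ poly(x, 2, raw = TRUE), data = df)
fit_I    <- lm(y ~ x + I(x^2), data = df)

# raw = TRUE reproduces the x + I(x^2) coefficients (only the names differ)
all.equal(unname(coef(fit_raw)), unname(coef(fit_I)))

# The orthogonal fit has different coefficients ...
isTRUE(all.equal(unname(coef(fit_orth)), unname(coef(fit_raw))))

# ... but describes the same fitted curve
all.equal(fitted(fit_orth), fitted(fit_I))
```

In other words, poly(x,2,raw=TRUE) corresponds to x+I(x^2), not to I(x^2) alone.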

Licensed under: CC-BY-SA with attribution
