Why do the correlation coefficients differ?

https://stackoverflow.com/questions/22588932

r
linear-regression
correlation
lm

19-06-2023

Question

Why aren't the correlation coefficients as given by the command

cor(t,g)

and as given by the command

summary(tgmodel, correlation=TRUE)

the same after running:

t<-c(0,1.2,2.3,3,4,5.2,6.3,7,8)
g<-c(12,10,8,11,6,7,2,3,3)
tgmodel<-lm(g~t)

Solution

They differ because they're correlations between different things:

cor(t, g) gives the correlation between the two data vectors, t and g.
summary(lm(...), correlation=TRUE) gives the correlation between the estimated coefficients, i.e. the intercept and the slope.
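
To see the two quantities side by side, here is a minimal sketch using the data from the question. The variable names data_cor, coef_cor and coef_cor_vcov are introduced here for illustration only. It assumes the correlation matrix printed by summary() can be read from the `correlation` component of the summary object, and that the same matrix can be derived from the coefficient covariance matrix with cov2cor(vcov(...)).

```r
t <- c(0, 1.2, 2.3, 3, 4, 5.2, 6.3, 7, 8)
g <- c(12, 10, 8, 11, 6, 7, 2, 3, 3)
tgmodel <- lm(g ~ t)

# Correlation between the two data vectors (a single number)
data_cor <- cor(t, g)

# Correlation matrix of the estimated coefficients (intercept and slope),
# i.e. what summary(tgmodel, correlation = TRUE) reports
coef_cor <- summary(tgmodel, correlation = TRUE)$correlation

# The same matrix, derived from the covariance matrix of the coefficients
coef_cor_vcov <- cov2cor(vcov(tgmodel))

print(data_cor)
print(coef_cor)
print(all.equal(coef_cor, coef_cor_vcov, check.attributes = FALSE))
```

The off-diagonal entry of coef_cor describes how the sampling errors of the intercept and slope estimates move together. In a one-predictor model with an intercept it depends only on the t values, not on g, which is why it has no reason to match cor(t, g).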

If you examine the output of summary(), you'll notice that it reports the square of the correlation coefficient between t and g as Multiple R-squared:

> summary(lm(g~t))

...
Multiple R-squared: 0.8357, Adjusted R-squared: 0.8122 
...

> cor(t,g)^2
[1] 0.8356938
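
The same comparison can be made programmatically instead of by reading the printed output. The sketch below assumes a simple linear regression with one predictor and an intercept, which is the case where R-squared equals the squared correlation. The names r2 and recovered_cor are illustrative.

```r
t <- c(0, 1.2, 2.3, 3, 4, 5.2, 6.3, 7, 8)
g <- c(12, 10, 8, 11, 6, 7, 2, 3, 3)
tgmodel <- lm(g ~ t)

# Multiple R-squared as stored in the summary object
r2 <- summary(tgmodel)$r.squared

# Compare with the squared correlation of the data (tolerant of rounding error)
print(all.equal(r2, cor(t, g)^2))

# Recover cor(t, g) from the model: take the square root of R-squared
# and give it the sign of the fitted slope
recovered_cor <- sign(coef(tgmodel)[["t"]]) * sqrt(r2)
print(all.equal(recovered_cor, cor(t, g)))
```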

Licensed under: CC-BY-SA with attribution
