Question

I need to find the quadratic equation of a curve I have plotted in R. When I do this in Excel, the equation appears in a text box on the chart, but I'm unsure how to move it to a cell for subsequent use (to apply to values that need calibrating), or indeed how to ask for it in R. If it can be obtained in R, can it be saved as an object to do future calculations with?

This seems like it should be a straightforward request in R, but I can't find any similar questions. Many thanks in advance for any help anyone can provide on this.


Solution

All the answers provide aspects of what you appear to want to do, but none thus far brings it all together. Let's consider the example from Tom Liptrot's answer:

fit <- lm(speed ~ dist + I(dist^2), cars)

This gives us a fitted linear model with a quadratic in the variable dist. We extract the model coefficients using the coef() extractor function:

> coef(fit)
 (Intercept)         dist    I(dist^2) 
 5.143960960  0.327454437 -0.001528367

So your fitted equation (subject to rounding because of printing) is:

\hat{speed} = 5.143960960 + (0.327454437 * dist) + (-0.001528367 * dist^2)

(where \hat{speed} is the fitted values of the response, speed).
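
To the "is it saveable as an object" part of the question: yes. As a minimal sketch (the names cf, b0, b1 and b2 are just ones I have made up here), the coefficients come back as a named numeric vector that you can store and index by name:

```r
# Refit the example model on the built-in cars data
fit <- lm(speed ~ dist + I(dist^2), data = cars)

# coef() returns a named numeric vector; keep it for later use
cf <- coef(fit)

# Pull out individual terms by name (hypothetical variable names)
b0 <- cf[["(Intercept)"]]  # constant term
b1 <- cf[["dist"]]         # linear term
b2 <- cf[["I(dist^2)"]]    # quadratic term

# Use the stored terms in an ordinary calculation, e.g. for dist = 21
b0 + b1 * 21 + b2 * 21^2
```

Indexing by name rather than by position means the code keeps working if the order of the terms in the formula changes.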

If you want to apply this fitted equation to some data, then we can write our own function to do it:

myfun <- function(newdist, model) {
    coefs <- coef(model)
    res <- coefs[1] + (coefs[2] * newdist) + (coefs[3] * newdist^2)
    return(res)
}

We can apply this function like this:

> myfun(c(21,3,4,5,78,34,23,54), fit)
[1] 11.346494  6.112569  6.429325  6.743024 21.386822 14.510619 11.866907
[8] 18.369782

for some new values of distance (dist), which is what you appear to want to do from your question. However, in R we don't normally do things like this: why should the user have to know how to form fitted or predicted values for all the different types of model that can be fitted in R?

In R, we use standard methods and extractor functions. In this case, if you want to apply the "equation" that Excel displays to all your data, to get the fitted values of this regression, we would use the fitted() function:

> fitted(fit)
        1         2         3         4         5         6         7         8 
 5.792756  8.265669  6.429325 11.608229  9.991970  8.265669 10.542950 12.624600 
        9        10        11        12        13        14        15        16 
14.510619 10.268988 13.114445  9.428763 11.081703 12.122528 13.114445 12.624600 
       17        18        19        20        21        22        23        24 
14.510619 14.510619 16.972840 12.624600 14.951557 19.289106 21.558767 11.081703 
       25        26        27        28        29        30        31        32 
12.624600 18.369782 14.057455 15.796751 14.057455 15.796751 17.695765 16.201008 
       33        34        35        36        37        38        39        40 
18.688450 21.202650 21.865976 14.951557 16.972840 20.343693 14.057455 17.340416 
       41        42        43        44        45        46        47        48 
18.038887 18.688450 19.840853 20.098387 18.369782 20.576773 22.333670 22.378377 
       49        50 
22.430008 21.93513

If you want to apply your model equation to some new data values not used to fit the model, then we need to get predictions from the model. This is done using the predict() function. Using the distances I plugged into myfun above, this is how we'd do it in a more R-centric fashion:

> newDists <- data.frame(dist = c(21,3,4,5,78,34,23,54))
> newDists
  dist
1   21
2    3
3    4
4    5
5   78
6   34
7   23
8   54
> predict(fit, newdata = newDists)
        1         2         3         4         5         6         7         8 
11.346494  6.112569  6.429325  6.743024 21.386822 14.510619 11.866907 18.369782

First up we create a new data frame with a component named "dist", containing the new distances we want to get predictions for from our model. It is important to note that we include in this data frame a variable that has the same name as the variable used when we created our fitted model. This new data frame must contain all the variables used to fit the model, but in this case we only have one variable, dist. Note also that we don't need to include anything about dist^2. R will handle that for us.

Then we use the predict() function, giving it our fitted model and providing the new data frame just created as argument 'newdata', giving us our new predicted values, which match the ones we did by hand earlier.
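
For the calibration use case in the question, a possible workflow (a sketch only; the column name speed_hat is an assumption of mine) is to write the predictions straight back into the new data frame so each input stays paired with its predicted value:

```r
# Refit the example model on the built-in cars data
fit <- lm(speed ~ dist + I(dist^2), data = cars)

# New values to calibrate; the column must be called dist, as in the model
newDists <- data.frame(dist = c(21, 3, 4, 5, 78, 34, 23, 54))

# Store the predictions alongside the inputs
newDists$speed_hat <- predict(fit, newdata = newDists)

newDists
```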

Something I glossed over is that predict() and fitted() are really a whole group of functions. There are versions for lm() models, for glm() models, etc. They are known as generic functions, with methods (versions, if you like) for several different types of object. You, the user, generally only need to remember to use fitted() or predict() etc., whilst R takes care of using the correct method for the type of fitted model you provide. Here are some of the methods available in base R for the fitted() generic function:

> methods(fitted)
[1] fitted.default*       fitted.isoreg*        fitted.nls*          
[4] fitted.smooth.spline*

   Non-visible functions are asterisked

You will possibly get more than this depending on what other packages you have loaded. The * just means you can't refer to those functions directly, you have to use fitted() and R works out which of those to use. Note there isn't a method for lm() objects. This type of object doesn't need a special method and thus the default method will get used and is suitable.
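
To make the idea of a generic with a default fallback concrete, here is a toy sketch. The generic describe() and its methods are hypothetical and exist only for this illustration; they are not part of R or any package used above:

```r
# A made-up generic: UseMethod() picks a method based on class(object)
describe <- function(object, ...) UseMethod("describe")

# Fallback used when no class-specific method exists
describe.default <- function(object, ...) "some fitted object"

# Class-specific method for glm objects
describe.glm <- function(object, ...) "a generalised linear model"

fit  <- lm(speed ~ dist + I(dist^2), data = cars)
gfit <- glm(speed ~ dist, data = cars)

class(fit)      # "lm": no describe.lm exists, so the default method is used
describe(fit)

class(gfit)     # "glm" "lm": describe.glm is found first
describe(gfit)
```

This mirrors what happens with fitted() on an lm object: there is no lm-specific method to find, so dispatch falls through to the default one.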

OTHER TIPS

You can add a quadratic term in the formula in lm to get the fit you are after. You need to use an I() around the term you want to square, as in the example below:

plot(speed ~ dist, cars)

fit1 <- lm(speed ~ dist, cars) # fits a linear model
abline(fit1) # puts line on plot
fit2 <- lm(speed ~ I(dist^2) + dist, cars) # fits a model with a quadratic term
fit2line <- predict(fit2, data.frame(dist = -10:130))
lines(-10:130, fit2line, col = 2) # puts line on plot

To get the coefficients from this use:

coef(fit2)
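
On why the I() wrapper is needed: inside a model formula, ^ is a formula operator rather than arithmetic, so dist^2 on its own does not add a squared term. A quick sketch to see the difference (the object names no_I and with_I are mine):

```r
# Without I(): dist^2 is read as formula syntax and collapses to just dist
no_I <- lm(speed ~ dist + dist^2, data = cars)

# With I(): dist^2 is evaluated arithmetically, giving a true quadratic term
with_I <- lm(speed ~ dist + I(dist^2), data = cars)

names(coef(no_I))    # intercept and dist only
names(coef(with_I))  # intercept, dist and I(dist^2)
```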

I don't think it is possible in Excel, as it only provides functions to get coefficients for a linear regression (SLOPE, INTERCEPT, LINEST) or for an exponential one (GROWTH, LOGEST), though you may have more luck using Visual Basic.

As for R you can extract model coefficients using the coef function:

mdl <- lm(y ~ poly(x, 2, raw = TRUE)) # raw = TRUE gives plain x and x^2 terms
coef(mdl) # all coefficients
coef(mdl)[3] # only the 2nd order coefficient
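
As a self-contained check using the built-in cars data (my choice of data here, since x and y above are placeholders), the poly(..., raw = TRUE) form should give the same coefficient values as writing the terms out with I(); only the coefficient names differ:

```r
# Quadratic fit written with raw polynomial terms
mdl_poly <- lm(speed ~ poly(dist, 2, raw = TRUE), data = cars)

# The same quadratic fit written out explicitly
mdl_I <- lm(speed ~ dist + I(dist^2), data = cars)

coef(mdl_poly)
coef(mdl_I)

# Compare the values, ignoring the differing names
all.equal(unname(coef(mdl_poly)), unname(coef(mdl_I)))
```

Note that without raw = TRUE, poly() uses orthogonal polynomials, and the coefficients are then not the ones you would plug into a + b*x + c*x^2.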

I guess you mean that you plot X vs Y values in Excel or R, and in Excel use the "Add trendline" functionality. In R, you can use the lm function to fit a linear function to your data, and this also gives you the "r squared" term (see examples in the linked page).
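
A short sketch of pulling out the "r squared" value that Excel prints with its trendline, again using the cars data as a stand-in for your own (the object names are mine):

```r
# Straight-line and quadratic fits on the built-in cars data
fit_lin  <- lm(speed ~ dist, data = cars)
fit_quad <- lm(speed ~ dist + I(dist^2), data = cars)

# summary() of an lm object stores R squared as a component
r2_lin  <- summary(fit_lin)$r.squared
r2_quad <- summary(fit_quad)$r.squared

r2_lin
r2_quad

# An adjusted version, which penalises extra terms, is stored too
summary(fit_quad)$adj.r.squared
```

Because the quadratic model contains the straight-line model as a special case, its plain R squared can never be lower, which is one reason to look at the adjusted value as well.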

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow