Question

Id like to know how is it possible that the best model (m6), based on AIC and BIC values, can have a non significant term whereas the second best model(m5) has a signficant term.

I have the following competing models list:

m1=gls(Area_Km2~size+cent_Latitude+PCptail+pwingbilltar,corMartins(1,phy=ctree),data = c)
m2=gls(Area_Km2~size+cent_Latitude+PCptail,corMartins(1,phy=ctree),data = c)
m3=gls(Area_Km2~size+cent_Latitude,corMartins(1,phy=ctree),data = c)
m4=gls(Area_Km2~size,corMartins(1,phy=ctree),data = c)
m5=gls(Area_Km2~PCptail,corMartins(1,phy=ctree),data = c)
m6=gls(Area_Km2~cent_Latitude,corMartins(1,phy=ctree),data = c)
m7=gls(Area_Km2~pwingbilltar,corMartins(1,phy=ctree),data = c)

Here is model comparison

   Model df      AIC      BIC    logLik   Test  L.Ratio p-value
m1     1  7 147.2775 157.9620 -66.63873                        
m2     2  6 139.4866 148.8187 -63.74331 1 vs 2 5.790838  0.0161
m3     3  5 133.3334 141.2510 -61.66672 2 vs 3 4.153191  0.0416
m4     4  4 130.7749 137.2186 -61.38746 3 vs 4 0.558517  0.4549
m5     5  4 127.0635 133.5072 -59.53175                        
m6     6  4 125.1006 131.5443 -58.55029                        
m7     7  4 132.4542 138.8979 -62.22711                        

here goes m6

 Generalized least squares fit by REML
  Model: Area_Km2 ~ cent_Latitude 
  Data: c 
       AIC      BIC    logLik
   125.1006 131.5442 -58.55029

Correlation Structure: corMartins
 Formula: ~1  
 Parameter estimate(s):
alpha 
    1 

Coefficients:
               Value Std.Error    t-value p-value
(Intercept)    0.4940905 0.1730082  2.8558795  0.0070
cent_Latitude -0.1592109 0.1726268 -0.9222837  0.3624

 Correlation: 
          (Intr)
cent_Latitude -0.158

Standardized residuals:
   Min         Q1        Med         Q3        Max 
-1.3270048 -0.7088524 -0.2828898  0.4672255  2.2203523 

Residual standard error: 1.066911 
Degrees of freedom: 39 total; 37 residual

here goes m5

Generalized least squares fit by REML
  Model: Area_Km2 ~ PCptail 
  Data: c 
       AIC      BIC    logLik
  127.0635 133.5072 -59.53175

Correlation Structure: corMartins
 Formula: ~1 
 Parameter estimate(s):
alpha 
    1 

Coefficients:
             Value  Std.Error   t-value p-value
(Intercept) 0.19752329 0.20158500 0.9798512  0.3335
PCptail     0.01925621 0.00851536 2.2613499  0.0297

 Correlation: 
        (Intr)
PCptail -0.595

Standardized residuals:
       Min         Q1        Med         Q3        Max 
-1.3416127 -0.6677304 -0.2467510  0.3198370  2.3339127 

Residual standard error: 1.01147 
Degrees of freedom: 39 total; 37 residual
Was it helpful?

Solution 2

You are fitting the model using method = "REML" which is the restricted likelihood. It does not always follow that the maximized likelihood under REML will be close to the likelihood under unrestricted ML. Set method = "ML" and see if that fixes the "problem" with AIC/BIC.

OTHER TIPS

There are at least two things going on here. First, it is not meaningful to assert that the model with the lowest AIC is the "best" model. For a set of models with different AIC, the relative probability that the ith model is better than the model with the lowest AIC is given by: (see here, and references cited therein):

L = exp[ ( AICmin - AICi ) / 2 ]

So, comparing m5 and m6:

L = exp[ (125.1006 - 127.0635) / 2 ] = 0.374

Or, there is a 37% chance that m5 is in fact the better model. The point is that there is not a significant difference between an AIC of 125.2 and an AIC of 127, so you cannot say the m6 is "best". Both models do about equally well.

So why is cent_Latitude insignificant in m6? A p-value > 0.05 means that the effect of cent_Latitude on the response is small compared to the error in the response. This could happen because the true effect size is 0, or it could happen because the effect size, combined with the range in cent_latitude results in an effect on the response which is small compared to the error in the response. You can see this in the following, which uses made-up data and creates the same effect yor are seeing with your real data.

Suppose the the response variable actually depends on both cent_Latitude and PCptail. In m6, variability in response due to the effect of PCptail is counted towards the "error" in the model, reducing the calculated significance of cent_Latitude. On the other hand, in m5 the variability in response due to the effect of cent_Latitude is counted toward the error, reducing the significance of PCptail. If the effect sizes, compared to the true error, are just right you can get this effect, as shown below. Note that this is one of the reasons it is not recommended to compare non-nested models using a single statistic (like AIC, or RSQ, or even F).

library(nlme)
set.seed(1)
# for simplicity, use un-correlated predictors
c <- data.frame(PCptail=sample(seq(0,10,.1),length(seq)),
                cent_Latitude=sample(seq(0,1,.01),length(seq)))
# response depends on both predictors
c$Area <- 1 + .01*c$PCptail +.1*c$cent_Latitude + rnorm(nrow(c),0,1)
m6 <- gls(Area~cent_Latitude,c)
m5 <- gls(Area~PCptail,c)
summary(m6)
# Generalized least squares fit by REML
#   Model: Area ~ cent_Latitude 
#   Data: c 
#        AIC      BIC    logLik
#   288.5311 296.3165 -141.2656
# 
# Coefficients:
#                    Value Std.Error   t-value p-value
# (Intercept)    1.1835936 0.1924341  6.150645  0.0000
# cent_Latitude -0.1882202 0.3324754 -0.566118  0.5726
# ...
summary(m5)
# Generalized least squares fit by REML
#   Model: Area ~ PCptail 
#   Data: c 
#        AIC      BIC    logLik
#   289.2713 297.0566 -141.6356
# 
# Coefficients:
#                 Value  Std.Error  t-value p-value
# (Intercept) 0.7524261 0.18871413 3.987121  0.0001
# PCptail     0.0674115 0.03260484 2.067530  0.0413
# ...

So how to deal with this? Well, have you looked at the residuals plots for all these models? Have you looked at the Q-Q plots? Leverage plots? Generally, all other thins being equal, I would pick the model with the more significant parameters, assuming the residuals were random and normally distributed, and the none of the data points had unusually high leverage.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top