The best model, according to both AIC and BIC, contains only a non significant term

Question 1

You are fitting the model using method = "REML" which is the restricted likelihood. It does not always follow that the maximized likelihood under REML will be close to the likelihood under unrestricted ML. Set method = "ML" and see if that fixes the "problem" with AIC/BIC.

Question 2

There are at least two things going on here. First, it is not meaningful to assert that the model with the lowest AIC is the "best" model. For a set of models with different AIC, the relative probability that the i^th model is better than the model with the lowest AIC is given by: (see here, and references cited therein):

L = exp[ ( AIC_min - AIC_i ) / 2 ]

So, comparing m5 and m6:

L = exp[ (125.1006 - 127.0635) / 2 ] = 0.374

Or, there is a 37% chance that m5 is in fact the better model. The point is that there is not a significant difference between an AIC of 125.2 and an AIC of 127, so you cannot say the m6 is "best". Both models do about equally well.

So why is cent_Latitude insignificant in m6? A p-value > 0.05 means that the effect of cent_Latitude on the response is small compared to the error in the response. This could happen because the true effect size is 0, or it could happen because the effect size, combined with the range in cent_latitude results in an effect on the response which is small compared to the error in the response. You can see this in the following, which uses made-up data and creates the same effect yor are seeing with your real data.

Suppose the the response variable actually depends on both cent_Latitude and PCptail. In m6, variability in response due to the effect of PCptail is counted towards the "error" in the model, reducing the calculated significance of cent_Latitude. On the other hand, in m5 the variability in response due to the effect of cent_Latitude is counted toward the error, reducing the significance of PCptail. If the effect sizes, compared to the true error, are just right you can get this effect, as shown below. Note that this is one of the reasons it is not recommended to compare non-nested models using a single statistic (like AIC, or RSQ, or even F).

library(nlme)
set.seed(1)
# for simplicity, use un-correlated predictors
c <- data.frame(PCptail=sample(seq(0,10,.1),length(seq)),
                cent_Latitude=sample(seq(0,1,.01),length(seq)))
# response depends on both predictors
c$Area <- 1 + .01*c$PCptail +.1*c$cent_Latitude + rnorm(nrow(c),0,1)
m6 <- gls(Area~cent_Latitude,c)
m5 <- gls(Area~PCptail,c)
summary(m6)
# Generalized least squares fit by REML
#   Model: Area ~ cent_Latitude 
#   Data: c 
#        AIC      BIC    logLik
#   288.5311 296.3165 -141.2656
# 
# Coefficients:
#                    Value Std.Error   t-value p-value
# (Intercept)    1.1835936 0.1924341  6.150645  0.0000
# cent_Latitude -0.1882202 0.3324754 -0.566118  0.5726
# ...
summary(m5)
# Generalized least squares fit by REML
#   Model: Area ~ PCptail 
#   Data: c 
#        AIC      BIC    logLik
#   289.2713 297.0566 -141.6356
# 
# Coefficients:
#                 Value  Std.Error  t-value p-value
# (Intercept) 0.7524261 0.18871413 3.987121  0.0001
# PCptail     0.0674115 0.03260484 2.067530  0.0413
# ...

So how to deal with this? Well, have you looked at the residuals plots for all these models? Have you looked at the Q-Q plots? Leverage plots? Generally, all other thins being equal, I would pick the model with the more significant parameters, assuming the residuals were random and normally distributed, and the none of the data points had unusually high leverage.