質問

I am trying to create a linear mixed model (lmm) that allows for a spatial correlation between points (have lat/long for each point). I would like the spatial correlation to be based upon the great circular distance between points.

The package ramps includes a correlation structure that computes the ‘haversine’ distance – although I am having trouble implementing it. I have previously used other correlation structures (corGaus, corExp) and not had any difficulties. I am assuming the corRGaus with the 'haversine' metric can be implemented in the same way.

I am able to successfully create an lmm with spatial correlation calculated on a planar distance using the lme function.

I am also able to create a linear model (not mixed) with spatial correlation calculated using great circular distance although there are errors with the correlation structure using the gls command.

When trying to the use the gls command for a linear model with the great circular distance I have the following errors:

x = runif(20, 1,50)
y = runif(20, 1,50)
gls(x ~ y, cor = corRGaus(form = ~ x + y))

Generalized least squares fit by REML
 Model: x ~ y 
 Data: NULL 
Log-restricted-likelihood: -78.44925

Coefficients:
 (Intercept)            y 
24.762656602  0.007822469 

Correlation Structure: corRGaus
 Formula: ~x + y 
 Parameter estimate(s):
Error in attr(object, "fixed") && unconstrained : 
 invalid 'x' type in 'x && y'

When I increase the size of the data there are memory allocation errors (still a very small dataset):

x = runif(100, 1, 50)
y = runif(100, 1, 50)
lat = runif(100, -90, 90)
long = runif(100, -180, 180)
gls(x ~ y, cor = corRGaus(form = ~ x + y))

Error in glsEstimate(glsSt, control = glsEstControl) : 
'Calloc' could not allocate memory (18446744073709551616 of 8 bytes)

When trying to run a mixed model using the lme command and the corRGaus from the ramps package the following results:

x = runif(100, 1, 50)
y = runif(100, 1, 50)
LC = c(rep(1, 50) , rep(2, 50))
lat = runif(100, -90, 90)
long = runif(100, -180, 180)

lme(x ~ y,random = ~ y|LC, cor = corRGaus(form = ~ long + lat))

Error in `coef<-.corSpatial`(`*tmp*`, value = value[parMap[, i]]) : 
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning messages:
1: In nlminb(c(coef(lmeSt)), function(lmePars) -logLik(lmeSt, lmePars),  :
  NA/NaN function evaluation
2: In nlminb(c(coef(lmeSt)), function(lmePars) -logLik(lmeSt, lmePars),  :
  NA/NaN function evaluation

I am unsure about how to proceed with this method. The "haversine" function is what I want to use to complete my models, but I am having trouble implementing them. There are very few questions anywhere about the ramps package, and I have seen very few implementations. Any helps would be greatly appreciated.

I have previously attempted to modify the nlme package and was unable to do so. I posted a question about this, where I was recommended to use the ramps package.

I am using R 3.0.0 on a Windows 8 computer.

役に立ちましたか?

解決

OK, here is an option that implements various spatial correlation structures in gls/nlme with haversine distance.

The various corSpatial-type classes already have machinery in place to construct a correlation matrix from spatial covariates, given a distance metric. Unfortunately, dist does not implement haversine distance, and dist is the function called by corSpatial to compute a distance matrix from the spatial covariates.

The distance matrix computations are performed in getCovariate.corSpatial. A modified form of this method will pass the appropriate distance to other methods, and the majority of methods will not need to be modified.

Here, I create a new corStruct class, corHaversine, and modify only getCovariate and one other method (Dim) that determines which correlation function is used. Those methods which do not need modification, are copied from equivalent corSpatial methods. The (new) mimic argument in corHaversine takes the name of the spatial class with the correlation function of interest: by default, it is set to "corSpher".

Caveat: beyond ensuring that this code runs for spherical and Gaussian correlation functions, I haven't really done a lot of checking.

#### corHaversine - spatial correlation with haversine distance

# Calculates the geodesic distance between two points specified by radian latitude/longitude using Haversine formula.
# output in km
haversine <- function(x0, x1, y0, y1) {
    a <- sin( (y1 - y0)/2 )^2 + cos(y0) * cos(y1) * sin( (x1 - x0)/2 )^2
    v <- 2 * asin( min(1, sqrt(a) ) )
    6371 * v
}

# function to compute geodesic haversine distance given two-column matrix of longitude/latitude
# input is assumed in form decimal degrees if radians = F
# note fields::rdist.earth is more efficient
haversineDist <- function(xy, radians = F) {
    if (ncol(xy) > 2) stop("Input must have two columns (longitude and latitude)")
    if (radians == F) xy <- xy * pi/180
    hMat <- matrix(NA, ncol = nrow(xy), nrow = nrow(xy))
    for (i in 1:nrow(xy) ) {
        for (j in i:nrow(xy) ) {
            hMat[j,i] <- haversine(xy[i,1], xy[j,1], xy[i,2], xy[j,2]) 
            }
        }
    as.dist(hMat)
}

## for most methods, machinery from corSpatial will work without modification
Initialize.corHaversine <- nlme:::Initialize.corSpatial
recalc.corHaversine <- nlme:::recalc.corSpatial
Variogram.corHaversine <- nlme:::Variogram.corSpatial
corFactor.corHaversine <- nlme:::corFactor.corSpatial
corMatrix.corHaversine <- nlme:::corMatrix.corSpatial
coef.corHaversine <- nlme:::coef.corSpatial
"coef<-.corHaversine" <- nlme:::"coef<-.corSpatial"

## Constructor for the corHaversine class
corHaversine <- function(value = numeric(0), form = ~ 1, mimic = "corSpher", nugget = FALSE, fixed = FALSE) {
    spClass <- "corHaversine"
    attr(value, "formula") <- form
    attr(value, "nugget") <- nugget
    attr(value, "fixed") <- fixed
    attr(value, "function") <- mimic
    class(value) <- c(spClass, "corStruct")
    value
}   # end corHaversine class
environment(corHaversine) <- asNamespace("nlme")

Dim.corHaversine <- function(object, groups, ...) {
    if (missing(groups)) return(attr(object, "Dim"))
    val <- Dim.corStruct(object, groups)
    val[["start"]] <- c(0, cumsum(val[["len"]] * (val[["len"]] - 1)/2)[-val[["M"]]])
    ## will use third component of Dim list for spClass
    names(val)[3] <- "spClass"
    val[[3]] <- match(attr(object, "function"), c("corSpher", "corExp", "corGaus", "corLin", "corRatio"), 0)
    val
}
environment(Dim.corHaversine) <- asNamespace("nlme")


## getCovariate method for corHaversine class
getCovariate.corHaversine <- function(object, form = formula(object), data) {
    if (is.null(covar <- attr(object, "covariate"))) {          # if object lacks covariate attribute
        if (missing(data)) {                                    # if object lacks data
            stop("need data to calculate covariate")
            }
        covForm <- getCovariateFormula(form)
        if (length(all.vars(covForm)) > 0) {                    # if covariate present
            if (attr(terms(covForm), "intercept") == 1) {       # if formula includes intercept
                covForm <- eval(parse(text = paste("~", deparse(covForm[[2]]),"-1",sep="")))    # remove intercept
                }
            # can only take covariates with correct names
            if (length(all.vars(covForm)) > 2) stop("corHaversine can only take two covariates, 'lon' and 'lat'")
            if ( !all(all.vars(covForm) %in% c("lon", "lat")) ) stop("covariates must be named 'lon' and 'lat'")
            covar <- as.data.frame(unclass(model.matrix(covForm, model.frame(covForm, data, drop.unused.levels = TRUE) ) ) )
            covar <- covar[,order(colnames(covar), decreasing = T)] # order as lon ... lat
            }
        else {
            covar <- NULL
            }

        if (!is.null(getGroupsFormula(form))) {                 # if groups in formula extract covar by groups
            grps <- getGroups(object, data = data)
            if (is.null(covar)) {
                covar <- lapply(split(grps, grps), function(x) as.vector(dist(1:length(x) ) ) )     # filler?
                } 
            else {
                giveDist <- function(el) {
                    el <- as.matrix(el)
                    if (nrow(el) > 1) as.vector(haversineDist(el))                       
                    else numeric(0)
                    }
                covar <- lapply(split(covar, grps), giveDist )
                }
            covar <- covar[sapply(covar, length) > 0]  # no 1-obs groups
            } 
        else {                                  # if no groups in formula extract distance
            if (is.null(covar)) {
                covar <- as.vector(dist(1:nrow(data) ) )
                } 
            else {
                covar <- as.vector(haversineDist(as.matrix(covar) ) )
                }
            }
        if (any(unlist(covar) == 0)) {          # check that no distances are zero
            stop("cannot have zero distances in \"corHaversine\"")
            }
        }
    covar
    }   # end method getCovariate
environment(getCovariate.corHaversine) <- asNamespace("nlme")

To test that this runs, given range parameter of 1000:

## test that corHaversine runs with spherical correlation (not testing that it WORKS ...)
library(MASS)
set.seed(1001)
sample_data <- data.frame(lon = -121:-22, lat = -50:49)
ran <- 1000 # 'range' parameter for spherical correlation
dist_matrix <- as.matrix(haversineDist(sample_data))    # haversine distance matrix
# set up correlation matrix of response
corr_matrix <- 1-1.5*(dist_matrix/ran)+0.5*(dist_matrix/ran)^3
corr_matrix[dist_matrix > ran] = 0
diag(corr_matrix) <- 1
# set up covariance matrix of response
sigma <- 2  # residual standard deviation
cov_matrix <- (diag(100)*sigma) %*% corr_matrix %*% (diag(100)*sigma)   # correlated response
# generate response
sample_data$y <- mvrnorm(1, mu = rep(0, 100), Sigma = cov_matrix)

# fit model
gls_haversine <- gls(y ~ 1, correlation = corHaversine(form=~lon+lat, mimic="corSpher"), data = sample_data)
summary(gls_haversine)

# Correlation Structure: corHaversine
# Formula: ~lon + lat 
# Parameter estimate(s):
#    range 
# 1426.818 
#
# Coefficients:
#                 Value Std.Error  t-value p-value
# (Intercept) 0.9397666 0.7471089 1.257871  0.2114
#
# Standardized residuals:
#        Min         Q1        Med         Q3        Max 
# -2.1467696 -0.4140958  0.1376988  0.5484481  1.9240042 
#
# Residual standard error: 2.735971 
# Degrees of freedom: 100 total; 99 residual

Testing that it runs with Gaussian correlation, with range parameter = 100:

## test that corHaversine runs with Gaussian correlation
ran = 100  # parameter for Gaussian correlation
corr_matrix_gauss <- exp(-(dist_matrix/ran)^2)
diag(corr_matrix_gauss) <- 1
# set up covariance matrix of response
cov_matrix_gauss <- (diag(100)*sigma) %*% corr_matrix_gauss %*% (diag(100)*sigma)   # correlated response
# generate response
sample_data$y_gauss <- mvrnorm(1, mu = rep(0, 100), Sigma = cov_matrix_gauss)

# fit model
gls_haversine_gauss <- gls(y_gauss ~ 1, correlation = corHaversine(form=~lon+lat, mimic = "corGaus"), data = sample_data)
summary(gls_haversine_gauss)

With lme:

## runs with lme
# set up data with group effects
group_y <- as.vector(sapply(1:5, function(.) mvrnorm(1, mu = rep(0, 100), Sigma = cov_matrix_gauss)))
group_effect <- rep(-2:2, each = 100)
group_y = group_y + group_effect
group_name <- factor(group_effect)
lme_dat <- data.frame(y = group_y, group = group_name, lon = sample_data$lon, lat = sample_data$lat)
# fit model
lme_haversine <- lme(y ~ 1, random = ~ 1|group, correlation = corHaversine(form=~lon+lat, mimic = "corGaus"), data = lme_dat, control=lmeControl(opt = "optim") )
summary(lme_haversine)

# Correlation Structure: corHaversine
#  Formula: ~lon + lat | group 
#  Parameter estimate(s):
#    range 
# 106.3482 
# Fixed effects: y ~ 1 
#                  Value Std.Error  DF     t-value p-value
# (Intercept) -0.0161861 0.6861328 495 -0.02359033  0.9812
#
# Standardized Within-Group Residuals:
#        Min         Q1        Med         Q3        Max 
# -3.0393708 -0.6469423  0.0348155  0.7132133  2.5921573 
#
# Number of Observations: 500
# Number of Groups: 5 

他のヒント

See if this answer on R-Help is useful: http://markmail.org/search/?q=list%3Aorg.r-project.r-help+winsemius+haversine#query:list%3Aorg.r-project.r-help%20winsemius%20haversine+page:1+mid:ugecbw3jjwphu2pb+state:results

I just checked and and doesn't appear that the ramps or nlme packages have been modified to incorporate those changes suggested by Malcolm Fairbrother, so you will need to do some hacking. I don't want to be considered for the bounty since I am not posting a tested solution and I didn't dream it up either.

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top