Domanda

I am trying to get the Hessian matrix from my own data, and I have two results -

  • using the code Hessian from library(numDeriv)
  • using code numericHessian from library(maxLik)

The result from the Hessian is very very small relative to the result from the numericHessian.

In this case, which results should I trust?

Specifically, the data I used ranged from 350000 to 1100000 and they were 9X2 matrix with a total of 18 data values.

I used with a sort of standard deviation formula and the result from "numericHessian" was ranging from 230 to 466 with 2X2 matrix, whereas the result from "Hessian" ranged from -3.42e-18 to 1.34e-17 which was much less than the previous one.

Which one do you think is correct calculation for the sort of standard deviation?

The code is as follows:

 data=read.table("C:/file.txt", header=T);
 data <- as.matrix(data);
 library(plyr)
 library(MASS)
 w1 = tail(data/(rowSums(data)),1)
 w2 = t(w1)
 f <- function(x){
 w1 = tail(x/(rowSums(x)),1)
 w2 = t(w1)
 r = ((w1%*%cov(cbind(x))%*%w2)^(1/2))
 return(r)
 }
 library(maxLik); 
 numericHessian(f, t0=rbind(data[1,1], data[1,2]))
 library(numDeriv);
 hessian(f, rbind(data[1,1], data[1,2]), method="Richardson")

The file.txt is the following:

  1                  2

 137                201

 122                342

 142                111

 171                126

 134                123

 823                876

 634                135

 541                214

 423                142

The result from the "numericHessian" is:

          [,1]        [,2]
 [1,] 0.007105427 0.007105427
 [2,] 0.007105427 0.000000000

Then, the result from the "Hessian" is:

          [,1]          [,2]
 [1,] -3.217880e-15 -1.957243e-16
 [2,] -1.957243e-16  1.334057e-16

Thank you very much in advance.

È stato utile?

Soluzione

You have not given a reproducible example, but I'll try anyway.

library(bbmle)
 x <- 0:10
 y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8)
 d <- data.frame(x,y)
 LL <- function(ymax=15, xhalf=6)
     -sum(stats::dpois(y, lambda=ymax/(1+x/xhalf), log=TRUE))
 fit <- mle2(LL)
 cc <- coef(fit)

Here are the finite-difference estimates of the Hessians (matrices of second derivatives) of the negative log-likelihood function at the MLE: inverting these matrices gives an estimate of the variance-covariance matrices of the parameters.

 library(numDeriv)
 hessian(LL,cc)
 ##               [,1]          [,2]
 ## [1,]  1.296717e-01 -1.185789e-15
 ## [2,] -1.185789e-15  4.922087e+00

 library(maxLik)
 numericHessian(LL, t0=cc)
 ##           [,1]     [,2]
 ## [1,] 0.1278977 0.000000
 ## [2,] 0.0000000 4.916956

So for this relatively trivial example, numDeriv::hessian and maxLik::numericHessian give very similar results. So there must be something you haven't shown us, or something special about the numerics of your problem. In order to proceed further, we need a reproducible example please ...

dat <- matrix(c(137,201,122,342,142,111,
                171,126,134,123,823,876,
                634,135,541,214,423,142),
        byrow=TRUE,ncol=2)
f <- function(x){
    w1 <- tail(x/(rowSums(x)),1)
    sqrt(w1%*%cov(cbind(x))%*%t(w1))
}
p <- t(dat[1,1:2,drop=FALSE])
f(p)  ## 45.25483
numDeriv::hessian(f,p)
##              [,1]          [,2]
## [1,] -3.217880e-15 -1.957243e-16
## [2,] -1.957243e-16  1.334057e-16
maxLik::numericHessian(f,t0=p)
##            [,1]        [,2]
## [1,] 0.007105427 0.007105427
## [2,] 0.007105427 0.000000000

OK, these clearly disagree. I'm not sure why, but in this particular case we can analyze what you're doing and see which one is right:

  • since your input matrix is a single column, x/rowSums(x) is a vector of ones, so the last element (w1 <- tail(...,1)) is just 1.
  • so your expression reduces to sqrt(cov(cbind(x))). Again, since x is a one-column matrix, cov() is just the variance, and sqrt(cov(.)) is just the standard deviation, or the norm of the vector.
  • the variance is a quadratic function of any element's deviation from the mean, and so the standard deviation is more or less linear in the deviation from the mean (except at zero), so we would expect the second derivatives to be zero. So it looks like numDeriv::hessian is giving the right answer

We can also confirm this by increasing eps for numericHessian:

maxLik::numericHessian(f,t0=p,eps=1e-3)
##      [,1]          [,2]
## [1,]    0  0.000000e+00
## [2,]    0 -7.105427e-09

The bottom line is that numDeriv uses a more accurate (but slower) method, but you can get reasonable answers from numericHessian if you're careful.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top