Question

I am looking to see how a function that computes the Hessian of a log-likelihood can be compiled, so that it can be efficiently used with different sets of parameters.

Here is an example.

Suppose we have a function that computes the log-likelihood of a logit model, where y is vector and x is a matrix. beta is a vector of parameters.

 pLike[y_, x_, beta_] :=
 Module[
  {xbeta, logDen},
  xbeta = x.beta;
  logDen = Log[1.0 + Exp[xbeta]];
  Total[y*xbeta - logDen]
  ]

Given the following data, we can use it as follows

In[1]:= beta = {0.5, -1.0, 1.0};

In[2]:= xmat = 
  Table[Flatten[{1, 
     RandomVariate[NormalDistribution[0.0, 1.0], {2}]}], {500}];

In[3]:= xbeta = xmat.beta;

In[4]:= prob = Exp[xbeta]/(1.0 + Exp[xbeta]);

In[5]:= y = Map[RandomVariate[BernoulliDistribution[#]] &, prob] ;

In[6]:= Tally[y]

Out[6]= {{1, 313}, {0, 187}}

In[9]:= pLike[y, xmat, beta]

Out[9]= -272.721

We can write its hessian as follows

 hessian[y_, x_, z_] :=
  Module[{},
   D[pLike[y, x, z], {z, 2}]
  ]


In[10]:= z = {z1, z2, z3}

Out[10]= {z1, z2, z3}

In[11]:= AbsoluteTiming[hess = hessian[y, xmat, z];]

Out[11]= {0.1248040, Null}

In[12]:= AbsoluteTiming[
 Table[hess /. {z1 -> 0.0, z2 -> -0.5, z3 -> 0.8}, {100}];]

Out[12]= {14.3524600, Null}

For efficiency reasons, I can compile the original likelihood function as follows

pLikeC = Compile[{{y, _Real, 1}, {x, _Real, 2}, {beta, _Real, 1}},
   Module[
    {xbeta, logDen},
    xbeta = x.beta;
    logDen = Log[1.0 + Exp[xbeta]];
    Total[y*xbeta - logDen]
    ],
   CompilationTarget -> "C", Parallelization -> True,  
   RuntimeAttributes -> {Listable}
   ];

which yields the same answer as pLike

In[10]:= pLikeC[y, xmat, beta]

Out[10]= -272.721

I am looking for an easy way to obtain similarly, a compiled version of the hessian function, given my interest in evaluating it many times.

Was it helpful?

Solution

Here is an idea based on the previous post(s): We construct the input to Compile symbolically.

mkCHessian[{y_, ys_Integer}, {x_, xs_Integer}, {beta_, bs_Integer}] :=
  With[{
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}],
   yi = Quiet[Part[y, #] & /@ Range[ys]],
   xi = Quiet[Table[Part[x, i, j], {i, xs}, {j, xs}]],
   betai = Quiet[Part[beta, #] & /@ Range[bs]]
   },
  Print[args];
  Print[yi];
  Print[xi];
  Print[betai];
  Compile[Evaluate[args], 
   Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
  ]

And then generate the compiled function.

cf = mkCHessian[{y, 3}, {x, 3}, {beta, 3}];

You then call that compiled function

cf[y, xmat, beta]

Please verify that I did not make a mistake; in de Vries's post y is of length 2. Mine is length 3. I am sure what is correct. And of course, the Print are for illustration...


Update
A version with slightly improved dimension handling and with variables localized:

ClearAll[mkCHessian];
mkCHessian[ys_Integer, bs_Integer] :=
 Module[
   {beta, x, y, args, xi, yi, betai},
   args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}];
   yi = Quiet[Part[y, #] & /@ Range[ys]];
   xi = Quiet[Table[Part[x, i, j], {i, ys}, {j, bs}]];
   betai = Quiet[Part[beta, #] & /@ Range[bs]];
   Compile[Evaluate[args], Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
 ]

Now, with asim's definitions in In[1] to In[5]:

cf = mkCHessian[500, 3];
cf[y, xmat, beta]

(* ==> {{-8.852446923, -1.003365612, 1.66653381}, 
       {-1.003365612, -5.799363241, -1.277665283},
       {1.66653381, -1.277665283, -7.676551252}}  *)

Since y is a random vector results will vary.

OTHER TIPS

Leonid already beat me, but I'll post my line of thought anyway just for laughs.

The main problem here is that compilation works for numerical functions whereas D needs symbolics. So the trick would be to first define the pLike function with the same amount of variables as needed for the particular size of matrices you intend to use, e.g,

pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}]

enter image description here

The Hessian:

D[pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}], {{z1, z2, z3}, 2}]

enter image description here

This function should be compilable as it depends on numerical quantities only.

To generalize for various vectors one could build something like this:

Block[{ny = 2, nx = 3, z1, z2, z3},
   hessian[
      Table[ToExpression["y" <> ToString[i] <> "_"], {i, ny}], 
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j] <> "_"], 
           {i, ny}, {j, nx}], {z1_, z2_, z3_}
   ] =
   D[
     pLike[
        Table[ToExpression["y" <> ToString[i]], {i, ny}], 
        Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], 
             {i, ny}, {j, nx}], {z1, z2, z3}
        ], 
     {{z1, z2, z3}, 2}
   ]
 ]

enter image description here

And this can of course be easily generalized for variable nx and ny.


And now for the Compile part. It's an ugly piece of code, consisting of the above and a compile and made suitable for variable y size. I like ruebenko's code more than mine.

ClearAll[hessianCompiled];
Block[{z1, z2, z3},
 hessianCompiled[yd_] :=
  (hessian[
     Table[ToExpression["y" <> ToString[i] <> "_"], {i, yd}], 
     Table[ToExpression["xr" <> ToString[i]<>"c"<>ToString[j] <>"_"],{i,yd},{j,3}],
     {z1_, z2_, z3_}
     ] =
    D[
     pLike[
      Table[ToExpression["y" <> ToString[i]], {i, yd}],
      Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], {i,yd},{j,3}],
      {z1, z2, z3}
     ], {{z1, z2, z3}, 2}
    ];
   Compile[{{y, _Real, 1}, {x, _Real, 2}, {z, _Real, 1}}, 
    hessian[Table[y[[i]], {i, yd}], Table[x[[i, j]], {i, yd}, {j, 3}],
      Table[z[[i]], {i, 3}]]]// Evaluate] // Quiet
   )
 ]

hessianCompiled[500][y, xmat, beta] // Timing 

{1.497, {{-90.19295669, -15.80180276, 6.448357845}, 
        {-15.80180276, -80.41058154, -26.33982586},
        {6.448357845, -26.33982586, -72.92978931}}}

ruebenko's version (including my edits):

(cf = mkCHessian[500, 3]; cf[y, xmat, beta]) // Timing

{1.029, {{-90.19295669, -15.80180276, 6.448357845}, 
         {-15.80180276, -80.41058154, -26.33982586}, 
         {6.448357845, -26.33982586, -72.92978931}}}

Note that both tests include compilation time. Running the calculation on its own:

h = hessianCompiled[500];
Do[h[y, xmat, beta], {100}]; // Timing
Do[cf[y, xmat, beta], {100}]; // Timing

(* timing for 100 hessians: 

   ==> {0.063, Null}

   ==> {0.062, Null}
*)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top