How to compile a function that computes the Hessian?
-
05-03-2021 - |
Question
I am looking to see how a function that computes the Hessian of a log-likelihood can be compiled, so that it can be efficiently used with different sets of parameters.
Here is an example.
Suppose we have a function that computes the log-likelihood of a logit model, where y is vector and x is a matrix. beta is a vector of parameters.
pLike[y_, x_, beta_] :=
Module[
{xbeta, logDen},
xbeta = x.beta;
logDen = Log[1.0 + Exp[xbeta]];
Total[y*xbeta - logDen]
]
Given the following data, we can use it as follows
In[1]:= beta = {0.5, -1.0, 1.0};
In[2]:= xmat =
Table[Flatten[{1,
RandomVariate[NormalDistribution[0.0, 1.0], {2}]}], {500}];
In[3]:= xbeta = xmat.beta;
In[4]:= prob = Exp[xbeta]/(1.0 + Exp[xbeta]);
In[5]:= y = Map[RandomVariate[BernoulliDistribution[#]] &, prob] ;
In[6]:= Tally[y]
Out[6]= {{1, 313}, {0, 187}}
In[9]:= pLike[y, xmat, beta]
Out[9]= -272.721
We can write its hessian as follows
hessian[y_, x_, z_] :=
Module[{},
D[pLike[y, x, z], {z, 2}]
]
In[10]:= z = {z1, z2, z3}
Out[10]= {z1, z2, z3}
In[11]:= AbsoluteTiming[hess = hessian[y, xmat, z];]
Out[11]= {0.1248040, Null}
In[12]:= AbsoluteTiming[
Table[hess /. {z1 -> 0.0, z2 -> -0.5, z3 -> 0.8}, {100}];]
Out[12]= {14.3524600, Null}
For efficiency reasons, I can compile the original likelihood function as follows
pLikeC = Compile[{{y, _Real, 1}, {x, _Real, 2}, {beta, _Real, 1}},
Module[
{xbeta, logDen},
xbeta = x.beta;
logDen = Log[1.0 + Exp[xbeta]];
Total[y*xbeta - logDen]
],
CompilationTarget -> "C", Parallelization -> True,
RuntimeAttributes -> {Listable}
];
which yields the same answer as pLike
In[10]:= pLikeC[y, xmat, beta]
Out[10]= -272.721
I am looking for an easy way to obtain similarly, a compiled version of the hessian function, given my interest in evaluating it many times.
Solution
Here is an idea based on the previous post(s): We construct the input to Compile symbolically.
mkCHessian[{y_, ys_Integer}, {x_, xs_Integer}, {beta_, bs_Integer}] :=
With[{
args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}],
yi = Quiet[Part[y, #] & /@ Range[ys]],
xi = Quiet[Table[Part[x, i, j], {i, xs}, {j, xs}]],
betai = Quiet[Part[beta, #] & /@ Range[bs]]
},
Print[args];
Print[yi];
Print[xi];
Print[betai];
Compile[Evaluate[args],
Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
]
And then generate the compiled function.
cf = mkCHessian[{y, 3}, {x, 3}, {beta, 3}];
You then call that compiled function
cf[y, xmat, beta]
Please verify that I did not make a mistake; in de Vries's post y is of length 2. Mine is length 3. I am sure what is correct. And of course, the Print are for illustration...
Update
A version with slightly improved dimension handling and with variables localized:
ClearAll[mkCHessian];
mkCHessian[ys_Integer, bs_Integer] :=
Module[
{beta, x, y, args, xi, yi, betai},
args = MapThread[{#1, _Real, #2} &, {{y, x, beta}, {1, 2, 1}}];
yi = Quiet[Part[y, #] & /@ Range[ys]];
xi = Quiet[Table[Part[x, i, j], {i, ys}, {j, bs}]];
betai = Quiet[Part[beta, #] & /@ Range[bs]];
Compile[Evaluate[args], Evaluate[D[pLike[yi, xi, betai], {betai, 2}]]]
]
Now, with asim's definitions in In[1] to In[5]:
cf = mkCHessian[500, 3];
cf[y, xmat, beta]
(* ==> {{-8.852446923, -1.003365612, 1.66653381},
{-1.003365612, -5.799363241, -1.277665283},
{1.66653381, -1.277665283, -7.676551252}} *)
Since y is a random vector results will vary.
OTHER TIPS
Leonid already beat me, but I'll post my line of thought anyway just for laughs.
The main problem here is that compilation works for numerical functions whereas D
needs symbolics. So the trick would be to first define the pLike function with the same amount of variables as needed for the particular size of matrices you intend to use, e.g,
pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}]
The Hessian:
D[pLike[{y1, y2}, {{x1, x2, x3}, {x12, x22, x32}}, {z1, z2, z3}], {{z1, z2, z3}, 2}]
This function should be compilable as it depends on numerical quantities only.
To generalize for various vectors one could build something like this:
Block[{ny = 2, nx = 3, z1, z2, z3},
hessian[
Table[ToExpression["y" <> ToString[i] <> "_"], {i, ny}],
Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j] <> "_"],
{i, ny}, {j, nx}], {z1_, z2_, z3_}
] =
D[
pLike[
Table[ToExpression["y" <> ToString[i]], {i, ny}],
Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]],
{i, ny}, {j, nx}], {z1, z2, z3}
],
{{z1, z2, z3}, 2}
]
]
And this can of course be easily generalized for variable nx and ny.
And now for the Compile
part. It's an ugly piece of code, consisting of the above and a compile and made suitable for variable y size. I like ruebenko's code more than mine.
ClearAll[hessianCompiled];
Block[{z1, z2, z3},
hessianCompiled[yd_] :=
(hessian[
Table[ToExpression["y" <> ToString[i] <> "_"], {i, yd}],
Table[ToExpression["xr" <> ToString[i]<>"c"<>ToString[j] <>"_"],{i,yd},{j,3}],
{z1_, z2_, z3_}
] =
D[
pLike[
Table[ToExpression["y" <> ToString[i]], {i, yd}],
Table[ToExpression["xr" <> ToString[i] <> "c" <> ToString[j]], {i,yd},{j,3}],
{z1, z2, z3}
], {{z1, z2, z3}, 2}
];
Compile[{{y, _Real, 1}, {x, _Real, 2}, {z, _Real, 1}},
hessian[Table[y[[i]], {i, yd}], Table[x[[i, j]], {i, yd}, {j, 3}],
Table[z[[i]], {i, 3}]]]// Evaluate] // Quiet
)
]
hessianCompiled[500][y, xmat, beta] // Timing
{1.497, {{-90.19295669, -15.80180276, 6.448357845},
{-15.80180276, -80.41058154, -26.33982586},
{6.448357845, -26.33982586, -72.92978931}}}
ruebenko's version (including my edits):
(cf = mkCHessian[500, 3]; cf[y, xmat, beta]) // Timing
{1.029, {{-90.19295669, -15.80180276, 6.448357845},
{-15.80180276, -80.41058154, -26.33982586},
{6.448357845, -26.33982586, -72.92978931}}}
Note that both tests include compilation time. Running the calculation on its own:
h = hessianCompiled[500];
Do[h[y, xmat, beta], {100}]; // Timing
Do[cf[y, xmat, beta], {100}]; // Timing
(* timing for 100 hessians:
==> {0.063, Null}
==> {0.062, Null}
*)