I believe you can save a lot of computation time by explicitly doing an LU factorization, [l, u, p, q] = lu(X'*X);
and use those factors when doing the calculations. Also, since X
are constant for about 100 models, pre-calculating X'*X
will most likely save you some time.
Note that in your case, the most time demanding operation might very well be the sqrt
-function.
% Constant for every 100 models or so:
M = X'*X;
[l, u, p, q] = lu(M);
% Now, I guess this should be quite a bit faster (I might have messed up the order):
nwse = sqrt(diag(N.*(q * ( u \ (l \ (p * Q))) * q * (u \ (l \ p)))));
The first two terms are commonly used:
l
the lower triangular matrix
u
the upper triangular matrix
Now, p
and q
are a bit more uncommon.
p
is a row permutation matrix used to obtain numerical stability. There is not much performance gain in doing [l, u, p] = lu(M)
compared to [l, u] = lu(M)
for sparse matrices.
q
however offers a significant performance gain. q
is a column permutation matrix that is used to reduce the amount of fill when doing the factorization.
Note that the [l, u, p, q] = lu(M)
is only a valid syntax for sparse matrices.
As for why using full pivoting as described above should be faster:
Try the following to see the purpose of the column permutation matrix q
. It is easier to work with elements that are aligned around the diagonal.
S = sprand(100,100,0.01);
[l, u, p] = lu(S);
spy(l)
figure
spy(u)
Now, compare it with this:
[ll, uu, pp, qq] = lu(S);
spy(ll);
figure
spy(uu);
Unfortunately, I don't have MATLAB here right now, so I can't guarantee that I put all the arguments in the correct order, but I think it's correct.