Which way has better accuracy to compute the matrix matrix vector product, A B u?

https://stackoverflow.com/questions/7550632

29-01-2021
|

Question

I want to compute the vector,

s = A B u,

where s and u are N-dimensional complex vector, A is a N-by-M complex matrix, B is M-by-N complex matrix. Which of the following two ways has better accuracy (more significant digits) when the elements of A, B, and u are represented as floating-point numbers?

(1) Computing B u first.

First do the matrix-vector multiplication,

y = B u

Then, another matrix-vector multiplication

s = A y

(2) Computing A B first.

First do the matrix-matrix multiplication,

C = A B

Then, matrix-vector multiplication

s = C u

Is there any known general rule?

By the way, I understand that the method (1) is much more efficient than method (2).

Solution

Matrix-vector multiplication has better numerical stability properties than matrix-matrix multiplication, so I would expect method (1) to be the more accurate one.

In more detail, matrix-vector multiplication has nice forward and backward error bounds. If we take for instance the matrix-vector multiplication y = B u, then the error in y is bounded by 2n times the unit round-off (1e-16 when using standard double precision numbers) times the largest number in the matrix B times the largest number in the vector u. This is the forward error bound.

The backward error bound is that the computed y is not precisely the product of B and u, but it is precisely the product of a slightly different matrix B' and the vector u. The difference between B and B' is bounded by 2n times the unit round-off times the largest number in the matrix B.

For matrix-matrix multiplication, there is a forward error bound similar to the one for matrix-vector multiplication, but there is no nice backward error bound.

This is a general principle: a computation with fewer outputs (such as matrix-vector multiplication) is more likely to be backward stable than a computation with more outputs (such as matrix-matrix computation).

However, whether this makes any difference is another matrix. It may be that method (2) recovers backward stability because of the matrix-vector product which follows the matrix-matrix product. It may also be that for your particular application, there is not much difference, or even that method (2) is in fact more accurate.

But, given that method (1) is certainly the faster one, and also possibly the more accurate one, I would definitely go for that option.

Added 29 September 2011: My favorite source on this topic is Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, 2002. But many textbooks on numerical analysis have a discussion about forward and backward error analysis, especially those books that concentrate on linear algebra.

The forward error is fairly intuitive. If you know that B and u is correct, then what you are interested in is the difference between the product B u as computed and the exact product; this is what the forward error analysis tells you. Backward error comes into play when the matrix B is not correct (it may be the result from earlier computations that commit an error, or it comes ultimately from measurements that suffer from experimental or modelling errors). Suppose the error in B is 1e-10 and the backward error in the multiplication is smaller than this, say 1e-11. This means that although the result of the multiplication is not correct for the B that you gave to the algorithm, it is correct for another matrix B which is so close to the original B that it is just as likely to be the correct B as the B you gave to the algorithm. So in some sense this is as good as you can hope for.

Forward and backward error analysis have different strengths: sometimes one applies, sometimes the other, sometimes a mixture. Ideally, an algorithm should have good forward and backward error bounds, but this does not happen very often.

OTHER TIPS

Except in cases where an algorithm is specifically designed to do extra work to compensate for numerical inaccuracy, an excellent rule of thumb is that given two ways to compute the same thing, the algorithm that does less work has better accuracy (after all, there are fewer opportunities to incur rounding). This is not universally true, so it doesn't remove the obligation to think about these things, but it's a good starting point.

In your case, it happens to be exactly correct. Without knowing anything a priori about the specific values in your matrices, method (1) should be preferred. (It is possible to construct specific cases in which method (2) would be more accurate, but they are generally highly contrived).

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow