How do units flow through matrix operations?

https://stackoverflow.com/questions/11975658

26-06-2021
|

Question

It seems that there are many useful applications for matrix math where not all entries in a given matrix share the same units. I want to look into type systems that could track these units and ensure we don't make mistakes (similar to a number of libraries and languages that already do dimension checking for scalar arithmetic). I'll give an example of what I am talking about, and then I have a few questions building from there.

(cribbing a random mixed-units linear programming example from here, although this is not a homework question as will hopefully become clear)

Bob’s bakery sells bagel and muffins. To bake a dozen bagels Bob needs 5 cups of flour, 2 eggs, and one cup of sugar. To bake a dozen muffins Bob needs 4 cups of flour, 4 eggs and two cups of sugar. Bob can sell bagels in $10/dozen and muffins in $12/dozen. Bob has 50 cups of flour, 30 eggs and 20 cups of sugar. How many bagels and muffins should Bob bake in order to maximize his revenue?

So let's put that in matrix form (arbitrary concrete syntax...):

 A = [ [ 5 cups of flour / dozen bagels, 4 cups of flour / dozen muffins ],
       [ 2 eggs          / dozen bagels, 4 eggs          / dozen muffins ],
       [ 1 cups of sugar / dozen bagels, 2 cups of sugar / dozen muffins ] ]

 B = [ [ 10 dollars      / dozen bagels, 12 dollars      / dozen muffins ] ]

 C = [ [ 50 cups of flour ],
       [ 30 eggs          ],
       [ 20 cups of sugar ] ]

We now want to maximize the inner product B * X such that A * X <= C and X >= 0, an ordinary linear programming problem.

In a hypothetical language with unit checking, how would we ideally represent the types of these matrices?

I'm thinking that an m by n matrix only needs m + n units and not the full m * n units, because unless the units are distributed in a sensible way into rows and columns then the only sensible operation remaining is to add/subtract the fully general matrix with another of the exact same shape or multiply it by a scalar.

What I mean is that the arrangement of units in A is far more useful than that in:

WTF = [ [ 6 pigeons,      11 cups of sugar ],
        [ 1 cup of sugar, 27 meters        ],
        [ 2 ohms,         2 meters         ] ]

And that furthermore situations like the latter simply don't arise in practice. (Anyone got a counterexample?)

Under this simplifying assumption, we can represent the units of a matrix with m + n units as follows. For each of the m rows, we figure out what units are common to all entries in that row, and similarly for the n columns. Let's put the row units in column vectors and the column units in row vectors because that makes Units(M) = RowUnits(M) * ColUnits(M), which seems like a nice property. So, in the example:

RowUnits(A) = [ [ cups of flour ],
                [ eggs ],
                [ cups of sugar ] ]
ColUnits(A) = [ [ dozen bagels ^ -1, dozen muffins ^ -1 ] ]

RowUnits(B) = [ [ dollars ] ]
ColUnits(B) = [ [ dozen bagels ^ -1, dozen muffins ^ -1 ] ]

RowUnits(C) = [ [ cups of flour ],
                [ eggs ],
                [ cups of sugar ] ]
ColUnits(C) = [ [ 1 ] ]

It seems that (although I'm not sure how to prove it...) the units of M1 * M2 are RowUnits(M1 * M2) = RowUnits(M1), ColUnits(M1 * M2) = ColUnits(M2), and for the multiplication to make sense we require ColUnits(M1) * RowUnits(M2) = 1.

We can now infer units for X, because the expression A * X <= C must mean that A * X and C have the same units. This means that RowUnits(A) = RowUnits(C) (which checks out), ColUnits(X) = ColUnits(C), and RowUnits(X) is the element-wise reciprocal of the transpose of ColUnits(A), in other words RowUnits(X) = [ [ dozen bagels ], [ dozen muffins ] ].

("Hooray", I hear you cheering, "we have just gone around the moon to look at something completely obvious!")

My questions are these:

Can you think of real world examples where elements of a matrix have units that do not fall into "row units" and "column units" like this?
Can you think of an elegant way to deal with situations where the same unit is a factor in every cell, and so it could equivalently be placed in every "row" or in every "column" and thus the row units and column units are not a unique representation? What should the side condition be that holds them in "lowest terms" and removes silliness like multiplying every row by furlongs ^ 17 just so that you can multiply every column by furlongs ^ -17?
Can you prove the rules I mentioned for propagating these unit annotations through matrix multiplications?
Can you discover/show the rule for how these unit annotations propagate through matrix inverse operations? Some hand calculations I did with a 2x2 matrix suggest that the units of Inverse(M) are the element-wise reciprocal of the units of Transpose(M), but I don't know how to show it for the general case or even if it is true for the general case.
Are you aware of any academic work on these issues? Or software that performs this static analysis for programs in some language? I may be using the wrong search terms, but I am having trouble finding anything.

My real-world applications of interest are preventing screw ups in signal processing/controller software by making sure that all the filter gains etc have the correct units everywhere, using matrices like these with different units in different cells is extremely common in those applications.

Solution

4) If you admit that inv(M)*M = Identity is dimensionless (Id*X=X), this is the same proof as 3)

5) I would inquire if it is possible to extend BOOST UNITS http://www.boost.org/doc/libs/1_50_0/doc/html/boost_units.html and eventually contact the authors

1 & 3)
Your dimensional problem Adim*Xdim=Ydim can be transformed into a dimensionless problem A*X=Y

you divide your input parameters X by their respective units(X)
and you multiply your outputs Y by their respective units(Y) afterwards

That is

Ydim=diag(units(Y))*Y
Ydim=diag(units(Y))*A*X
Ydim=diag(units(Y))*A*inv(diag(units(X)))*Xdim
Ydim=Adim*Xdim

Or the other way

Adim*Xdim=Ydim
Adim*diag(units(X))*X=diag(units(Y))*Y
Adim*diag(units(X))*X=diag(units(Y))*A*X

This is true for whatever X, so you find that

A=inv(diag(units(Y))) * Adim * diag(units(X))
Adim=diag(units(Y)) * A * inv(diag(units(X)))

That is you multiply rows of dimensionless A with units of Y and columns by inverse of units of X

2) Mathematically you trivially factor a scalar quantity out of the matrix, so I think the question is more related to 5), how do you represent this in soft. You'll have to take this feature as a requirement when answering 5)

OTHER TIPS

I think that good real-world example - an answer to your question No 1 - could be Discrete Kalman Filter implementation. In general, its equations operate on tuples and matrices where some of them might represent values that correspond to physical units.

As long as there is a need for multidimensional Kalman filter calculation, where estimated results are not homogeneous in value type, it looks to me that there will be are inversions and transpositions, as well as multiplication between matrices having certain (symmetrical) mixture of values representing different physical things on their elements.

I've just touched the problem when trying to implement sensor fusion algorithm using DKF with C++ and strongly typed value representation in the code.

There are plenty of Kalman filter implementations where developers simply use vectors and matrices of integers, floats, doubles etc, but I'd like to keep track of dimension of every values used and then the issue appeared.

Unfortunately, I'm quite fresh to the problem (still working on it), so I'm not yet able to help you with answers 2 to 5 right now :-/

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow