Entropy of a probability distribution in Excel
03-07-2021
Question
I have a matrix in Excel. I need to normalize each row and then calculate the entropy of each row (treating it as a probability distribution).
For example, suppose my matrix is:
2 0 3 5
0 1 0 0
1 0 3 2
After row normalization the matrix becomes:
0.2000 0 0.3000 0.5000
0 1.0000 0 0
0.1667 0 0.5000 0.3333
Assuming each row is a probability distribution, the entropy of each row is:
1.0297
0
1.0114
I want to calculate the entropy values above without producing the intermediate row-normalized matrix.
Is it possible to do this in Excel?
Note: Entropy of a probability distribution is defined as:
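H(X) = sum over all x {-p(x) * log(p(x))}
(log is the natural logarithm here, which matches the values above, and terms with p(x) = 0 count as 0)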
Solution
If your original matrix is in A1:D3, try this formula in F1:
=SUM(-A1:D1/SUM(A1:D1)*IF(A1:D1<>0,LN(A1:D1/SUM(A1:D1))))
Confirm it with CTRL+SHIFT+ENTER (so that curly braces appear around the formula in the formula bar), then copy it down to F3.
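As a cross-check outside Excel, here is a minimal Python sketch of the same calculation. It assumes the example matrix from the question; the function name row_entropy is made up for illustration. Like the formula, it divides by the row sum on the fly and skips zero cells, so no normalized matrix is stored.

```python
import math

def row_entropy(row):
    """Entropy (natural log) of a row of non-negative values, normalized on the fly."""
    total = sum(row)
    if total == 0:
        raise ValueError("row sums to zero, so it cannot be normalized")
    # Skip zero cells: 0 * ln(0) is taken as 0, like the IF(...<>0, ...) guard.
    return sum(-(x / total) * math.log(x / total) for x in row if x != 0)

# Example matrix from the question (the A1:D3 range).
matrix = [
    [2, 0, 3, 5],
    [0, 1, 0, 0],
    [1, 0, 3, 2],
]

entropies = [row_entropy(row) for row in matrix]
for h in entropies:
    print(f"{h:.4f}")
```

The printed values should agree with the 1.0297, 0 and 1.0114 given in the question to four decimal places.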
OTHER TIPS
Assuming your entropy is defined by x ln x, I'd suggest the following:
- Create a matrix that computes ln(x) for each original cell: IF(X>0;LN(X);0)
- Create a second matrix that multiplies each x by the corresponding ln(x)
- Compute the row sums of that second matrix, e.g. SUM(A1:D1) for a four-column row
I don't know how to do this without intermediate matrices, though. Why would you want this?
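For what it's worth, the x ln x sums on the raw cells can be related to the normalized entropy with a little algebra. The sketch below uses notation that is not in the original posts: x_i for the cells of one row, S for the row sum, and p_i = x_i / S for the normalized values.

```latex
\begin{aligned}
H &= -\sum_{i:\,x_i>0} p_i \ln p_i, \qquad p_i = \frac{x_i}{S},\quad S = \sum_i x_i \\
  &= -\sum_{i:\,x_i>0} \frac{x_i}{S}\,\bigl(\ln x_i - \ln S\bigr) \\
  &= \ln S - \frac{1}{S}\sum_{i:\,x_i>0} x_i \ln x_i
\end{aligned}
```

For the first row of the example (2 0 3 5), S = 10 and the sum of x ln x is about 12.7293, so H is about 2.3026 - 1.2729 = 1.0297, matching the value in the question. So the row sum of the x ln x matrix only needs to be divided by S and subtracted from ln S.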