Question

I have a matrix in Excel. I need to normalize its rows and then calculate the entropy of each row (treating each row as a probability distribution).

For example, suppose my matrix is:

2   0   3   5
0   1   0   0
1   0   3   2

After row normalization the matrix becomes:

0.2000         0    0.3000    0.5000
     0    1.0000         0         0
0.1667         0    0.5000    0.3333

Assuming each row is a probability distribution, the entropy of each row is:

1.0297
     0
1.0114

I want to calculate the above entropy values without producing the intermediate row-normalized matrix.

Is it possible to do this in Excel?

Note: Entropy of a probability distribution is defined as:

H(X) = sum over all x {-p(x) * log(p(x))}

Solution

If your original matrix is in A1:D3, try this formula in F1:

=SUM(-A1:D1/SUM(A1:D1)*IF(A1:D1<>0,LN(A1:D1/SUM(A1:D1))))

Confirm it with CTRL+SHIFT+ENTER (so that curly braces appear around the formula in the formula bar), then copy it down to F3.
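To see what the array formula is doing, here is a minimal Python sketch of the same per-row calculation. It is only an illustration: the function name `entropy_from_counts` is made up for this example, and it assumes natural logarithms (which is what LN uses and what the values in the question imply). Each cell is divided by the row sum inline, so no normalized matrix is stored, and zero cells are skipped in the same way the IF guards against LN(0).

```python
import math

def entropy_from_counts(row):
    """Entropy of one row of non-negative counts, normalizing on the fly."""
    total = sum(row)  # plays the role of SUM(A1:D1)
    # Each term is -x/S * ln(x/S); a zero cell contributes 0,
    # which is what IF(A1:D1<>0, ...) achieves in the Excel formula.
    return sum(-(x / total) * (math.log(x / total) if x != 0 else 0.0)
               for x in row)

# Example matrix from the question (hypothetical variable name).
matrix = [[2, 0, 3, 5], [0, 1, 0, 0], [1, 0, 3, 2]]
entropies = [entropy_from_counts(row) for row in matrix]
print([round(h, 4) for h in entropies])
```

Rounded to four decimals, the results should agree with the entropy values listed in the question.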

OTHER TIPS

Assuming your entropy is defined in terms of x ln x, I'd suggest the following:

  1. Create a matrix that computes ln(x) for each original cell: IF(X>0;LN(X);0)
  2. Create a second matrix that multiplies each x by its ln(x), cell by cell
  3. Compute the row sums of that second matrix, e.g. SUM(A1:D1) for a four-column row

I don't know how to do this without intermediate matrices, though. Why would you want this?
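Note that the row sums of x ln x are not yet the entropy of the normalized row. One way to connect the two, offered here as an assumption rather than as what this answer intended, is the identity H = ln(S) - (1/S) * sum(x ln x), where S is the row sum. It follows from expanding ln(x/S) = ln(x) - ln(S) in the definition. A short Python sketch (the function name `entropy_via_xlogx` is hypothetical) shows that only the raw counts and the row total are needed:

```python
import math

def entropy_via_xlogx(row):
    """Entropy of a row of non-negative counts via H = ln(S) - sum(x ln x) / S."""
    total = sum(row)  # S, the row sum
    # Sum of x * ln(x) over the raw counts; zero cells contribute 0.
    xlogx = sum(x * math.log(x) for x in row if x > 0)
    return math.log(total) - xlogx / total

for row in [[2, 0, 3, 5], [0, 1, 0, 0], [1, 0, 3, 2]]:
    print(round(entropy_via_xlogx(row), 4))
```

Under this identity the normalized probabilities never have to be materialized, only the row total and the x ln x sum.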

Licensed under: CC-BY-SA with attribution