A sum of products? sounds like a job for np.einsum:
import numpy as np
N = 150
K = 3
M = 4
x = np.random.random((N,M))
mu = np.random.random((K,M))
gamma = np.random.random((N,K))
xbar = x-mu[:,None,:] # shape (3, 150, 4)
sigma = np.einsum('nk,knm,kno->kmo', gamma, xbar, xbar)
sigma /= gamma.sum(axis=0)[:,None,None]
Decoding 'nk,knm,kno->kmo'
:
This subscript specification has three components to the left of the array (->
) followed by one component to the right.
The three components on the left correspond with the subscripts for gamma
, xbar
and xbar
, the operands being passed to np.einsum
.
gamma
has subscript nk
, just as in the formula you posted.
xbar
has shape (3, 150, 4). You can think of it as having subscript knm
, where k
and n
have the same meaning as in the formula you posted, and m
is a subscript representing the axis of length 4 which is not explicitly mentioned in your formula, but is apparently there given your description of the shape of the arrays.
Now the third subscript component is kno
. The o
subscript is used because the o
plays the same role as the m
subscript, but we don't want summation over m
. In fact, we want the m
and o
subscripts to be iterated over independently, not in lock-step. Hence we give the third subscript different letters.
Notice that n
appears in the subscripts on the left (nk, knm, kno
), but does not appear on the right (in kmo
). That tells np.einsum
to sum over n
.
The k
appears in the subscripts on the left and on the right. This tells np.einsum
that we wish to advance the k
subscript in lock-step, but (since it appears on the right) we don't want to sum over k
.
Since kmo
appears on the right, these subscripts remain in the result. This results in sigma
having shape (K,M,M)
(i.e. (3,4,4)).