Question

I was teaching myself about entropy and came across this equation: $$ H = - \sum_x p(x) \log p(x) $$

Entropy can also be written as an expected value: $$ H(X) = \operatorname*{\mathbb{E}}_{X \sim P}[I(X)] = -\operatorname*{\mathbb{E}}_{X \sim P}[\log P(X)]. $$

But the expected value is written as

$$ \mathbb{E}[X] = \sum_{i=1}^k x_i p_i = x_1p_1 + x_2p_2 + \cdots + x_k p_k $$

Using the above expected value formula, I expected the entropy equation to look something like this:

$$H(X)= -\operatorname*{\mathbb{E}}_{X \sim P}[\log P(X)]= - \sum_x x\,P(x)\log P(x) $$

Where has the $x$ gone in the actual entropy formula in summation notation?

Solution

Here is the definition of the expectation of a discrete random variable $Y$: $$ \mathbb{E}[Y] = \sum_y \Pr[Y = y] \cdot y. $$

In your case, $Y = -\log P(X)$, where $X \sim P$. Therefore $$ \mathbb{E}[Y] = \sum_y \Pr[-\log P(X) = y] \cdot y. $$

Notice that $$ \Pr[-\log P(X) = y] \cdot y = \sum_{x\colon -\log P(x)=y} \Pr[X = x] \cdot y = -\sum_{x\colon -\log P(x)=y} \Pr[X = x] \cdot \log P(x). $$

Therefore $$ H(X) = \mathbb{E}[Y] = -\sum_y \sum_{x\colon -\log P(x) = y} \Pr[X = x] \cdot \log P(x) = -\sum_x \Pr[X = x] \log P(x) = -\sum_x P(x) \log P(x). $$

So the $x$ has not disappeared. In the generic formula $\mathbb{E}[X] = \sum_i x_i p_i$, the $x_i$ are the *values the random variable takes*. Here the random variable is $-\log P(X)$, not $X$ itself, so its values are the numbers $-\log P(x)$, each weighted by the probability $P(x)$. This is an instance of the law of the unconscious statistician: $\mathbb{E}[g(X)] = \sum_x P(x)\, g(x)$, applied with $g(x) = -\log P(x)$.
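If it helps, here is a minimal Python sketch (the distribution `P` below is a made-up example) that computes the entropy both ways: once by summing directly over outcomes $x$, and once by applying the expectation definition to the values $y$ taken by the random variable $-\log P(X)$. Both routes give the same number.

```python
import math
from collections import defaultdict

# A made-up discrete distribution; any P whose probabilities sum to 1 works.
P = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Entropy computed directly, summing over outcomes x:
#   H = -sum_x P(x) * log P(x)
H_direct = -sum(p * math.log2(p) for p in P.values())

# Entropy computed from the definition of expectation applied to
# the random variable Y = -log P(X): group outcomes by the value y
# they map to, then compute H = E[Y] = sum_y Pr[Y = y] * y.
prob_of_y = defaultdict(float)
for x, p in P.items():
    y = -math.log2(p)   # the value Y takes on outcome x
    prob_of_y[y] += p   # Pr[Y = y] accumulates the Pr[X = x] that map to y

H_via_expectation = sum(prob * y for y, prob in prob_of_y.items())

print(H_direct, H_via_expectation)  # both print 1.75 (bits)
```

Grouping outcomes by their value of $-\log P(x)$ is exactly the $\sum_y$ step in the derivation above; in practice one skips the grouping and sums over $x$ directly.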

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange