Decoding problem and conditional probabilities
06-11-2019
Question
I'm reading the book by MacKay "Information theory, inference and learning algorithms" and I'm confused by how he introduces the decoding problem for LDPC codes (page 557).
given a channel output $\mathbf{r}$, we wish to find the codeword $\mathbf{t}$ whose likelihood $P(\mathbf{r}|\mathbf{t})$ is biggest.
I can get on board with that, even if it seems a bit backwards: since $\mathbf{r}$ is the evidence we have, I would be more comfortable trying to find the $\mathbf{t}$ that maximizes $P(\mathbf{t}|\mathbf{r})$. And indeed, he goes on to say
there are two approaches to the decoding problem, both of which lead to the generic problem 'find the $\mathbf{x}$ that maximizes $$ P^*(\mathbf{x})=P(\mathbf{x})\mathbb{1}[\mathbf{Hx}=\mathbf{z}]$$'
He presents the two points of view on the following page; the first one especially confuses me:
the codeword decoding point of view
First, we note the prior distribution over codewords $$ P(\mathbf{t})=\mathbb{1}[\mathbf{Ht}=\mathbf{0}]$$[...]. The posterior distribution over codewords is given by multiplying the prior by the likelihood [...] $$P(\mathbf{t}|\mathbf{r})\propto P(\mathbf{t})P(\mathbf{r}|\mathbf{t}) $$
From the generic decoding problem he gave, it looks like we're actually trying to maximize the posterior $P(\mathbf{t}|\mathbf{r})$ (the $P^*$ above is proportional to it), instead of the likelihood $P(\mathbf{r}|\mathbf{t})$.
Is it obvious that the two maxima are the same, with the same maximizer? It is not obvious to me: the $\mathbf{t_0}$ that maximizes $P(\mathbf{r}|\mathbf{t})$ might actually minimize $P(\mathbf{t})$, so the maximizer of $P(\mathbf{t})P(\mathbf{r}|\mathbf{t})$ might be different! I understand that in this case $P(\mathbf{t})$ is uniform over the codewords, so this shouldn't be a problem, but it seems odd that this is simply not stated, and I feel like I'm missing something.
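To convince myself of the uniform-prior case, I tried a small brute-force check (a toy example of my own, not from the book): with a uniform prior over codewords, the posterior is a constant times the likelihood, so the ML and MAP maximizers coincide.

```python
import itertools
import numpy as np

# Toy parity-check matrix (my own example, not from the book)
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
n = H.shape[1]
f = 0.1  # BSC flip probability

# Enumerate the codebook: all t with H t = 0 (mod 2)
codewords = [np.array(t) for t in itertools.product((0, 1), repeat=n)
             if not (H @ np.array(t) % 2).any()]

r = np.array([1, 0, 0, 1, 0, 1])  # an arbitrary received word

def likelihood(t):
    """P(r | t) for a binary symmetric channel with flip probability f."""
    d = int((t != r).sum())          # Hamming distance between t and r
    return f**d * (1 - f)**(n - d)

# ML decoding: maximize P(r|t) over the codebook
t_ml = max(codewords, key=likelihood)

# MAP decoding: maximize P(t)P(r|t); P(t) is uniform over codewords,
# so it is a constant factor that cannot change the argmax
t_map = max(codewords, key=lambda t: likelihood(t) / len(codewords))

print((t_ml == t_map).all())  # the two maximizers coincide
```

Of course this only shows the uniform case; my question is why the book does not say this explicitly.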
Why should we start with the problem of maximizing $P(\mathbf{r}|\mathbf{t})$ and not $P(\mathbf{t}|\mathbf{r})$? Why does he seem to switch after a few sentences? Am I correct in thinking the algorithm presented actually maximizes $P(\mathbf{t}|\mathbf{r})$?
Thank you