Question

I have two classes with class-conditional densities $p(x|y=0)$ and $p(x|y=1)$, which have means $\mu_0$ and $\mu_1$ and a shared covariance matrix $\Sigma$. For a particular observation, one feature $x_n$ is missing. We replace this missing value with the class-conditional means $E(x_n|y=0)$ and $E(x_n|y=1)$. How do I justify this treatment of the missing value?

This is actually an exercise. One thing I noticed, though: if $p(x|y=0)$ and $p(x|y=1)$ have different means $\mu_0$ and $\mu_1$, how can their covariance matrices be the same?

Assuming the covariance matrices are the same, the only advantage I see is this: if I fill in the missing value with the class-conditional mean, then in the multivariate Gaussian the $n$-th component of the vector $(x-\mu)$ is $0$, since $(x_n-\mu_n)=0$, so the missing feature has no impact on the calculation of $p(y=0|x)$ and $p(y=1|x)$.

Is this explanation correct?
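The zeroing argument can be checked numerically. Below is a minimal sketch, assuming NumPy; the parameter values and the names `mu0`, `mu1`, `sigma`, `imputed_quad_form`, and `reduced_quad_form` are made up for illustration and are not part of the exercise. It compares the Gaussian quadratic form $(x-\mu)^T\Sigma^{-1}(x-\mu)$ after class-conditional mean imputation with the same form computed only over the observed features.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
mu0 = np.array([0.0, 1.0, -1.0])
mu1 = np.array([2.0, 0.0, 1.0])
sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])   # shared covariance matrix
precision = np.linalg.inv(sigma)      # Sigma^{-1}

n = 2                                  # index of the missing feature
x_obs = np.array([1.0, 0.5, np.nan])   # observation with x_n missing
keep = [i for i in range(len(x_obs)) if i != n]


def imputed_quad_form(x, mu, precision, n):
    """Quadratic form after filling x[n] with the class mean mu[n]."""
    x_filled = x.copy()                # leave the caller's array untouched
    x_filled[n] = mu[n]
    d = x_filled - mu                  # d[n] is exactly 0 by construction
    return float(d @ precision @ d)


def reduced_quad_form(x, mu, precision, keep):
    """Quadratic form using only observed features and the matching
    sub-block of Sigma^{-1} (not the inverse of a sub-block of Sigma)."""
    d = (x - mu)[keep]
    return float(d @ precision[np.ix_(keep, keep)] @ d)


for label, mu in (("y=0", mu0), ("y=1", mu1)):
    q_imp = imputed_quad_form(x_obs, mu, precision, n)
    q_red = reduced_quad_form(x_obs, mu, precision, keep)
    print(label, q_imp, q_red, np.isclose(q_imp, q_red))
```

In this sketch the two quantities agree for each class, because every term of the quadratic form that involves $x_n$ is multiplied by $(x_n-\mu_n)=0$. Note that the surviving terms use the sub-block of $\Sigma^{-1}$, which in general differs from the inverse of the corresponding sub-block of $\Sigma$ unless feature $n$ is uncorrelated with the other features.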

No accepted solution

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange