Question

Assume you have the following dataset, where the two variables Color and Size are observed:

Color | Size 
------+------
Red   | Big 
White | Small
Red   | Small
Red   | Big
White | Big
Red   | Big

You are asked to learn the maximum likelihood parameters for the Bayesian network shown below:

Color -> Size

You get more data for the learning problem described in the table but the new dataset contains missing values. Which algorithm can you use to learn the maximum likelihood parameters now?


Solution

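If you simply discard the cases with missing values, you waste the information in their observed fields, and the estimates can be biased unless the values are missing completely at random.

Instead, treat the missing values as hidden variables and use the Expectation-Maximisation (EM) algorithm. The E-step uses the current parameters to compute expected counts for the missing entries, and the M-step re-estimates the parameters from those expected counts. The two steps are repeated until the parameters converge (in general to a local maximum of the likelihood). http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm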
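
As an illustration, here is a minimal sketch of EM for the Color -> Size network. It is not a reference implementation: the function name `em`, the use of `None` to mark a missing value, the uniform starting point, and the fixed iteration count are all assumptions made for this example, and the three incomplete rows at the end are hypothetical.

```python
from itertools import product

COLORS = ("Red", "White")
SIZES = ("Big", "Small")


def em(records, iterations=50):
    """Estimate P(Color) and P(Size | Color) from records with missing values.

    Each record is a (color, size) pair; None marks a missing value
    (an assumption of this sketch).
    """
    # Uniform starting parameters (an arbitrary choice for this sketch).
    p_color = {c: 1.0 / len(COLORS) for c in COLORS}
    p_size = {c: {s: 1.0 / len(SIZES) for s in SIZES} for c in COLORS}

    for _ in range(iterations):
        # E-step: spread each record over the completions consistent with
        # what was observed, weighted by the current P(color) * P(size | color).
        counts = {c: {s: 0.0 for s in SIZES} for c in COLORS}
        for color, size in records:
            compatible = [
                (c, s)
                for c, s in product(COLORS, SIZES)
                if color in (None, c) and size in (None, s)
            ]
            weights = [p_color[c] * p_size[c][s] for c, s in compatible]
            total = sum(weights)
            for (c, s), w in zip(compatible, weights):
                counts[c][s] += w / total

        # M-step: re-estimate the parameters from the expected counts.
        n = len(records)
        for c in COLORS:
            n_c = sum(counts[c].values())
            p_color[c] = n_c / n
            if n_c > 0:
                p_size[c] = {s: counts[c][s] / n_c for s in SIZES}

    return p_color, p_size


# The six complete rows from the question.
complete = [
    ("Red", "Big"), ("White", "Small"), ("Red", "Small"),
    ("Red", "Big"), ("White", "Big"), ("Red", "Big"),
]
# Hypothetical incomplete rows, added only to exercise the missing-value path.
incomplete = [("Red", None), (None, "Big"), (None, "Small")]

p_color, p_size = em(complete + incomplete)
print(p_color)
print(p_size)
```

When every record is complete, the E-step weights are all 1, so the sketch reduces to plain counting. On the six rows from the question that gives P(Color=Red) = 4/6, P(Size=Big | Color=Red) = 3/4 and P(Size=Big | Color=White) = 1/2.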

Licensed under: CC-BY-SA with attribution