How do POMDPs and Dynamic Influence Diagrams differ?

https://cs.stackexchange.com/questions/50053

03-11-2019

Question

To give some perspective, first consider the following diagram comparing Markov Chains, HMMs, MDPs, and POMDPs (I'm not sure who to credit for it).

                    Fully observable          Partially observable
                _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
               |                         |                           |
    no actions |      Markov chain       |           HMM             |
               |_ _ _ _ _ _ _ _ _ _ _ _ _|_ _ _ _ _ _ _ _ _ _ _ _ _ _|
               |                         |                           |
    actions    |          MDP            |          POMDP            |
               |_ _ _ _ _ _ _ _ _ _ _ _ _|_ _ _ _ _ _ _ _ _ _ _ _ _ _|

Recall that an HMM lets us model probability distributions over sequences of observations. Bayesian networks (not pictured) generalize HMMs by modeling conditional distributions over sets of random variables (see here for a description). When modeling a problem over time, one adds a time index to the model, resulting in a dynamic Bayesian network.
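To make "a distribution over a sequence of observations" concrete, here is a minimal sketch of the forward algorithm for a discrete HMM. The two-state model and all of its numbers are made up for illustration; only the recursion itself is standard.

```python
def sequence_likelihood(initial, transition, emission, observations):
    """Forward algorithm: probability of an observation sequence under a discrete HMM.

    initial[s]        = P(state_1 = s)
    transition[r][s]  = P(state_{t+1} = s | state_t = r)
    emission[s][o]    = P(obs_t = o | state_t = s)
    """
    n_states = len(initial)
    # alpha[s] = P(obs_1..obs_t, state_t = s)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emission[s][obs]
            * sum(alpha[r] * transition[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state, two-symbol model (illustrative numbers only).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.9, 0.1], [0.2, 0.8]]

likelihood = sequence_likelihood(initial, transition, emission, [0, 1, 0])
```

Note that there are no actions anywhere in this model: the hidden state evolves on its own, which is what places the HMM in the "no actions" row of the diagram.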

A tool known as a dynamic influence diagram extends dynamic Bayesian networks to decision-making problems through the inclusion of actions that can affect the evolution of the problem.
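As a rough illustration of what "actions that affect the evolution of the problem" means under partial observability, here is a sketch of the Bayesian belief update usually written for POMDPs. It assumes discrete states, a transition model indexed by action, and an observation model that depends only on the resulting state; the action names and numbers are hypothetical.

```python
def belief_update(belief, action, observation, transition, observation_model):
    """One step of the belief update b'(s') ∝ O(o | s') * sum_s T(s' | s, a) * b(s).

    belief[s]                   = current P(state = s)
    transition[a][s][s2]        = P(next state = s2 | state = s, action = a)
    observation_model[s2][o]    = P(observation = o | next state = s2)
    """
    n = len(belief)
    # Prediction step: push the belief through the action-dependent dynamics.
    predicted = [
        sum(belief[s] * transition[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight by the likelihood of the observation, then normalize.
    unnormalized = [observation_model[s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical two-state model: "stay" keeps the state, "reset" randomizes it.
transition = {
    "stay": [[1.0, 0.0], [0.0, 1.0]],
    "reset": [[0.5, 0.5], [0.5, 0.5]],
}
observation_model = [[0.85, 0.15], [0.15, 0.85]]

new_belief = belief_update([0.5, 0.5], "stay", 0, transition, observation_model)
```

Compared with the HMM case, the only structural change is that the transition model is selected by the chosen action.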

My question is: how do dynamic influence diagrams and POMDPs compare? On the surface they seem like they are modeling the same problem type. What sort of problems are amenable to each tool?

No accepted answer

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange