Question

Today in a lecture it was claimed that the direction of edges in a Bayes network doesn't really matter: they don't have to represent causality.

It is obvious that you cannot switch an arbitrary single edge in a Bayes network. For example, let $G = (V, E)$ with $V = \{v_1, v_2, v_3\}$ and $E=\{(v_1, v_2), (v_1, v_3), (v_2, v_3)\}$. If you switched $(v_1, v_3)$ to $(v_3, v_1)$, then $G$ would no longer be acyclic and hence no longer a Bayes network. This seems to be mainly a practical problem of how to estimate the probabilities then. This case seems much more difficult to answer, so I will skip it.
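A quick sanity check of this example (a minimal DFS-based cycle test written just for illustration; the function name `is_acyclic` is my own, not from any library):

```python
def is_acyclic(vertices, edges):
    """Return True iff the directed graph (vertices, edges) has no cycle."""
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
    state = {v: 0 for v in vertices}  # 0 = unvisited, 1 = on DFS stack, 2 = done

    def visit(v):
        if state[v] == 1:   # back edge: we returned to a node on the stack
            return False
        if state[v] == 2:
            return True
        state[v] = 1
        ok = all(visit(w) for w in adj[v])
        state[v] = 2
        return ok

    return all(visit(v) for v in vertices)

V = ["v1", "v2", "v3"]
E = [("v1", "v2"), ("v1", "v3"), ("v2", "v3")]
E_flipped = [("v1", "v2"), ("v3", "v1"), ("v2", "v3")]  # (v1,v3) reversed

print(is_acyclic(V, E))          # True
print(is_acyclic(V, E_flipped))  # False: cycle v1 -> v2 -> v3 -> v1
```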

This made me ask the following questions for which I hope to get answers here:

  1. For any directed acyclic graph (DAG), is it always possible to reverse all edges and still have a DAG?
  2. Assume a DAG $G$ and data are given. Now we construct the reversed DAG $G_\text{inv}$. For both DAGs, we fit the corresponding Bayes networks to the data. Now we have a data set with missing attributes that we want to predict using the Bayes network. Could the two DAGs give different results? (Bonus if you come up with an example.)
  3. Similar to 2, but simpler: Assume a DAG $G$ and data are given. You may create a new graph $G'$ by reversing any set of edges, as long as $G'$ remains acyclic. Are the Bayes networks equivalent when it comes to their predictions?
  4. Do we gain something if the edges do represent causality?

Solution

TL;DR: sometimes you can make an equivalent Bayesian network by reversing arrows, and sometimes you can't.

Simply reversing the direction of the arrows yields another directed graph, but that graph is not necessarily the graph of an equivalent Bayesian network, because the dependence relations represented by the reversed-arrow graph can be different from those represented by the original graph. If the reversed-arrow graph represents different dependence relations than the original, in some cases it's possible to create an equivalent Bayesian network by adding some more arrows to capture dependence relations which are missing in the reversed-arrow graph. But in some cases there is not an exactly equivalent Bayesian network. If you have to add some arrows in order to capture dependencies, you might end up with a graph which represents fewer independence relations and therefore fewer opportunities for simplifying the computations of posterior probabilities.

For example, a -> b -> c represents the same dependencies and independencies as a <- b <- c, and the same as a <- b -> c, but not the same as a -> b <- c. This last graph says that a and c are independent if b is not observed, while a <- b -> c says a and c are dependent in that case. We can add an edge directly from a to c to capture that dependence, but then the independence of a and c when b is not observed is no longer represented. That means there is at least one factorization which we cannot exploit when computing posterior probabilities.
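The special behaviour of the collider a -> b <- c can be checked numerically. Here is a small sketch using a hypothetical distribution of my own choosing (binary variables, a and c fair independent coins, b = a XOR c); it verifies that a and c are independent marginally but become dependent once b is observed:

```python
from itertools import product

def p(a, b, c):
    # Joint P(a, b, c) = P(a) * P(c) * P(b | a, c) for the collider
    # a -> b <- c, with b deterministic: b = a XOR c.
    return 0.25 if b == (a ^ c) else 0.0

# Marginal P(a, c): without observing b, a and c are independent.
for a, c in product([0, 1], repeat=2):
    p_ac = sum(p(a, b, c) for b in [0, 1])
    print(a, c, p_ac)   # always 0.25 = P(a) * P(c)

# Conditional on b = 1: a and c are now perfectly dependent (a != c).
pb1 = sum(p(a, 1, c) for a in [0, 1] for c in [0, 1])
for a, c in product([0, 1], repeat=2):
    print(a, c, p(a, 1, c) / pb1)   # 0.5 when a != c, else 0.0
```

No fork a <- b -> c (which asserts a and c are independent given b) can represent this joint without extra edges, which is exactly the point made above.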

All this stuff about dependence/independence, arrows and their reversals, etc., is covered in standard texts on Bayesian networks. I can dig out some references if you want.

Bayesian networks don't express causality. Judea Pearl, who did a lot of work on Bayesian networks, has also worked on what he calls causal networks (essentially Bayesian networks annotated with causal relations).

OTHER TIPS

This might be a bit unsatisfying, so feel free not to accept this answer, and apologies in advance.

In a Bayes net, nodes represent random variables, and edges represent conditional dependences. When you interpret the nodes a certain way, conditioning naturally flows a certain way. Arbitrarily reversing edges doesn't really make sense in the context of modeling data. And a lot of the time, the arrows do represent causality.

Question 3

synergy.st-andrews.ac.uk/vannesmithlab claims that the graphs

G1 = o->o->o and
G2 = o<-o->o

are in the same equivalence class. According to that source, these models represent exactly the same joint probability distribution.
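This equivalence can be verified directly: for the chain G1 the joint factorizes as $P(a)P(b|a)P(c|b)$, and applying Bayes' rule to the first edge gives $P(b)P(a|b)P(c|b)$, the factorization of G2. A numeric check with made-up CPT numbers (purely illustrative, not from the source):

```python
from itertools import product

# Arbitrary CPTs for G1: a -> b -> c (binary variables).
p_a = {0: 0.3, 1: 0.7}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # [a][b]
p_c_given_b = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.5, 1: 0.5}}   # [b][c]

def joint_g1(a, b, c):
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# Derive G2's CPTs: P(b) by marginalisation, P(a|b) by Bayes' rule.
p_b = {b: sum(p_a[a] * p_b_given_a[a][b] for a in [0, 1]) for b in [0, 1]}
p_a_given_b = {b: {a: p_a[a] * p_b_given_a[a][b] / p_b[b] for a in [0, 1]}
               for b in [0, 1]}

def joint_g2(a, b, c):
    # G2: a <- b -> c
    return p_b[b] * p_a_given_b[b][a] * p_c_given_b[b][c]

for a, b, c in product([0, 1], repeat=3):
    assert abs(joint_g1(a, b, c) - joint_g2(a, b, c)) < 1e-12
print("G1 and G2 define the same joint distribution")
```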

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange