Is there any domain where Bayesian Networks outperform neural networks?

https://datascience.stackexchange.com/questions/9818

16-10-2019

Question

Neural networks get top results in Computer Vision tasks (see MNIST, ILSVRC, Kaggle Galaxy Challenge). They seem to outperform every other approach in Computer Vision. But there are also other tasks:

Kaggle Molecular Activity Challenge
Regression: Kaggle Rain prediction (also the 2nd place)
Grasp and Lift (identify hand motions from EEG recordings): 2nd and also 3rd place

I'm not too sure about ASR (automatic speech recognition) and machine translation, but I think I've also heard that (recurrent) neural networks (start to) outperform other approaches.

I am currently learning about Bayesian Networks and I wonder in which cases those models are usually applied. So my question is:

Is there any challenge / (Kaggle) competition where the state of the art is a Bayesian network, or at least a very similar model?

(Side note: I've also seen decision trees win in several recent Kaggle challenges.)

Solution

One area where Bayesian approaches are often used is where you need interpretability of the prediction system. You don't want to give doctors a neural net and say that it's 95% accurate. You would rather explain the assumptions your method makes, as well as the decision process it uses.

A similar area is where you have strong prior domain knowledge and want to use it in the system.
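To make the interpretability point concrete, here is a minimal sketch of a two-node network (Disease -> TestResult) in plain Python. The network and all probabilities are made-up illustration values, not taken from any real study. Every assumption is a named number that a domain expert can inspect or correct, and the decision process is just Bayes' rule.

```python
# Hypothetical two-node Bayesian network: Disease -> TestResult.
# All probabilities below are made-up illustration values.
P_DISEASE = 0.01            # prior P(disease), e.g. supplied by a domain expert
P_POS_GIVEN_DISEASE = 0.95  # P(positive test | disease)
P_POS_GIVEN_HEALTHY = 0.05  # P(positive test | no disease)


def posterior_disease_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' rule."""
    # P(positive) = sum over both states of the Disease node.
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


posterior = posterior_disease_given_positive(
    P_DISEASE, P_POS_GIVEN_DISEASE, P_POS_GIVEN_HEALTHY
)
print(f"P(disease | positive test) = {posterior:.3f}")
```

Changing the prior or the test's error rates changes the answer in a way that can be traced step by step, which is the kind of explanation the answer above has in mind.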

Other tips

Bayesian networks and neural networks are not mutually exclusive. In fact, "Bayesian network" is just another term for "directed graphical model". They can be very useful in designing objective functions for neural networks. Yann LeCun has pointed this out here: https://plus.google.com/+YannLeCunPhD/posts/gWE7Jca3Zoq.

One example:

The variational autoencoder and its derivatives are directed graphical models of the form $$p(x) = \int_z p(x|z)p(z) dz.$$ A neural network is used to implement $p(x|z)$ and an approximation to its inverse: $q(z|x) \approx p(z|x)$.
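The marginal $p(x) = \int_z p(x|z)p(z) dz$ can be illustrated with a small Monte Carlo sketch. This is not a variational autoencoder: as a labeled assumption, the neural network decoder is replaced by a linear-Gaussian stand-in, $p(x|z) = \mathcal{N}(x; wz, \sigma^2)$ with $p(z) = \mathcal{N}(0, 1)$, so that the exact answer $\mathcal{N}(x; 0, w^2 + \sigma^2)$ is available for comparison. The names `w`, `sigma`, and the function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def marginal_likelihood_mc(x, w, sigma, n_samples=100_000, seed=0):
    """Estimate p(x) = integral of p(x|z) p(z) dz by sampling z from p(z) = N(0, 1).

    Assumption: p(x|z) = N(x; w * z, sigma^2) stands in for a neural decoder.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)               # z ~ p(z)
        total += normal_pdf(x, w * z, sigma)  # p(x | z)
    return total / n_samples


x, w, sigma = 1.0, 2.0, 0.5
estimate = marginal_likelihood_mc(x, w, sigma)
exact = normal_pdf(x, 0.0, math.sqrt(w**2 + sigma**2))  # closed form for this toy model
print(f"Monte Carlo estimate: {estimate:.4f}, exact: {exact:.4f}")
```

With a nonlinear neural decoder there is no closed form, and sampling from the prior becomes inefficient in high dimensions. That is where the approximate inverse $q(z|x)$ comes in.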

Excellent answers already.

One domain I can think of, and am working extensively in, is customer analytics.

I have to understand and predict the moves and motives of customers in order to inform and warn the customer support, marketing, and growth teams.

Neural networks do a really good job here at churn prediction and similar tasks. But I prefer the Bayesian network style, for these reasons:

Customers always have a pattern. They always have a reason to act, and that reason is either something my team has done for them or something they have learnt themselves. So everything has a prior here, and that reason is very important because it drives most of the decisions the customer takes.
Every move by the customer and the growth teams in the marketing/sales funnel is cause and effect. So prior knowledge is vital when it comes to converting a prospective lead into a customer.

So, the concept of prior is very important when it comes to customer analytics, which makes the concept of Bayesian networks very important to this domain.

Suggested Learning:

Bayesian Methods for Neural Networks

Bayesian networks in business analytics

Sometimes you care as much about changing the outcome as predicting the outcome.

A neural network given enough training data will tend to predict the outcome better, but once you can predict the outcome, you then may wish to predict the effect of making changes in the input features on the outcome.

An example from real life: knowing that someone is likely to have a heart attack is useful, but being able to tell the person that if they stopped doing XX, the risk would reduce by 30% is of much greater benefit.

Likewise for customer retention, knowing why customers stop shopping with you is worth as much as predicting which customers are likely to stop shopping with you.

Also a simpler Bayesian Network that predicts less well but leads to more action being taken may often be better than a more “correct” Bayesian Network.

The biggest advantage of Bayesian networks over neural networks is that they can be used for causal inference. This branch is of fundamental importance to statistics and machine learning, and Judea Pearl won the Turing Award for this research.
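As a rough illustration of what causal inference adds, the sketch below uses a hypothetical three-node network with a confounder (Z -> X, Z -> Y, X -> Y) and made-up probabilities. It compares the observational quantity P(Y=1 | X=x) with the interventional quantity P(Y=1 | do(X=x)), computed with the standard backdoor adjustment over Z. The adjustment is only valid under the assumption that this graph is the true causal structure.

```python
# Hypothetical binary network with a confounder: Z -> X, Z -> Y, X -> Y.
# All numbers are made-up illustration values.
P_Z = {0: 0.5, 1: 0.5}           # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}  # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {                # P(Y=1 | X=x, Z=z), keyed by (x, z)
    (0, 0): 0.1, (1, 0): 0.2,
    (0, 1): 0.5, (1, 1): 0.6,
}


def p_x_given_z(x, z):
    """P(X=x | Z=z) for binary X."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1


def observational(x):
    """P(Y=1 | X=x): conditioning, which mixes in the influence of Z."""
    num = sum(P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    den = sum(p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    return num / den


def interventional(x):
    """P(Y=1 | do(X=x)) via backdoor adjustment: sum_z P(Y=1 | x, z) P(z)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)


naive_effect = observational(1) - observational(0)
causal_effect = interventional(1) - interventional(0)
print(f"association: {naive_effect:.2f}, causal effect: {causal_effect:.2f}")
```

In this toy setup the association between X and Y is much larger than the effect of actually setting X, because Z drives both. A purely predictive model would report only the first number.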

Bayesian networks might outperform neural networks in small-data settings. If the prior information is properly encoded via the network structure, priors, and other hyperparameters, they might have an edge over neural networks. Neural networks, especially the ones with more layers, are well known to be data hungry. Almost by definition, lots of data is necessary to train them properly.
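A minimal sketch of why priors help with little data, assuming the simplest possible case: estimating one probability (think of a single entry in a conditional probability table) from a handful of binary observations. The Beta(2, 2) prior and the counts are made-up illustration values.

```python
def mle(successes, trials):
    """Maximum-likelihood estimate of a probability: the raw frequency."""
    return successes / trials


def posterior_mean(successes, trials, alpha=2.0, beta=2.0):
    """Posterior mean under a Beta(alpha, beta) prior (conjugate to the binomial).

    alpha and beta act like pseudo-counts encoding prior knowledge.
    """
    return (successes + alpha) / (trials + alpha + beta)


# Three observations, all positive: the raw frequency jumps to certainty,
# while the prior keeps the estimate away from the extreme.
small_mle = mle(3, 3)
small_bayes = posterior_mean(3, 3)

# With much more data the two estimates nearly agree.
large_mle = mle(300, 400)
large_bayes = posterior_mean(300, 400)

print(f"n=3:   MLE={small_mle:.3f}, posterior mean={small_bayes:.3f}")
print(f"n=400: MLE={large_mle:.3f}, posterior mean={large_bayes:.3f}")
```

The prior matters most when data is scarce and is gradually overridden as observations accumulate.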

I've posted this link on Reddit and got a lot of feedback. Some have posted their answers here, others didn't. This answer should sum the reddit post up. (I made it community wiki, so that I don't get points for it)

Auto-Encoding Variational Bayes is a combination of a Bayes Network and a neural network. The paper Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference seems to go in the same direction.
Dropout: A Simple Way to Prevent Neural Networks from Overfitting is an example where Bayesian neural networks outperform their dropout approach (see section 6.4 "Comparison with Bayesian Neural Networks")
Human-level concept learning through probabilistic program induction is a paper "on a Bayesian net network that does one-shot classification that way outperformed neural networks" (according to trashacount12345; I haven't checked that yet).
Yann LeCun wrote a Google+ post in which he argues that neural networks and probabilistic graphical models are not orthogonal concepts.

Bayesian networks are preferred for genome interpretation. See, for example, this dissertation discussing computational methods for genome interpretation.

I did a small example for this once. From that, I think Bayesian Networks are preferred if you want to capture a distribution but your input training set doesn't cover the distribution well. In such cases, even a neural network that generalised well would not be able to reconstruct the distribution.

I strongly disagree that neural nets do better than other learners. In fact, neural nets do pretty badly compared to other methods. There is also no methodology, apart from some advice on choosing parameters, so this is very often done by chance. There are also some people who talk at random on forums about how good neural nets are, not because they have any evidence for it, but because they are attracted by the fancy buzzword "neural". Neural nets are also very unstable. Have you tried comparing a neural net with xgboost? I will not try any neural net until it becomes self-conscious. So until then, happy neural netting :)

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange