Question

Given that a machine has reached a broken state, there are potentially fixes that can be applied to get it running again. For the problem defined further below, with millions of data points, hundreds of features, and tens of labels, should we use a single deep neural network with multiple outputs, or an ensemble of binary networks, each trained on human-selected features presumed relevant to its fix? Is there a standard approach?

The state of the machine is captured in potentially hundreds of features, a good mix of continuous values and categorical data. We have millions of documented cases of machine states, and we can identify when a machine broke and which fixes were applied to get it running again. There are fewer than 40 fixes we are interested in, and we are considering a bucket for "other" fixes and a bucket for "unfixable".
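For illustration, here is a minimal sketch of how a single mixed row of machine state might be turned into a numeric feature vector for a network. The feature names and category values are hypothetical; in practice a library encoder (e.g. scikit-learn's `OneHotEncoder`) would do this over the whole dataset.

```python
import numpy as np

# Hypothetical row of machine state: two continuous readings
# plus one categorical feature.
temperature = 71.3
vibration = 0.02
machine_type = "B"  # categorical

# One-hot encode the categorical feature (assumed categories A, B, C).
categories = ["A", "B", "C"]
one_hot = [1.0 if machine_type == c else 0.0 for c in categories]

# Final numeric feature vector fed to the network.
x = np.array([temperature, vibration] + one_hot)
```

Continuous features would typically also be standardized before training, but that is omitted here for brevity.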

We are treating this as multi-label classification, because it may take multiple discrete fixes (Fix-1, Fix-2, ..., Fix-N) to get the machine up and running again. Not all features are relevant to each fix, so we're wondering whether each fix should have its own binary classification network (each outputting a single value representing Fix-i or not Fix-i), trained on what we think are the relevant features, or whether we should create one large deep neural network with multiple labels and a sigmoid over each fix (Fix-1, Fix-2, ..., Fix-N). With the latter approach, should combinations of fixes be represented as their own labels (Fix-1, Fix-2, Fix-1&2, Fix-3, Fix-1&3, Fix-2&3, Fix-1&2&3, etc.)?
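To make the two output representations concrete, here is a minimal sketch (the 3-fix setup and the set of applied fixes are hypothetical) contrasting a per-fix target vector, where one sigmoid output per fix suffices, with a label-powerset class index, where every combination of fixes becomes its own class:

```python
import numpy as np

# Hypothetical case: 3 possible fixes, and this machine needed Fix-1 and Fix-3.
n_fixes = 3
applied = {0, 2}  # zero-based indices of the applied fixes

# Multi-label target: one binary entry (one sigmoid output) per fix.
multilabel = np.array([1 if i in applied else 0 for i in range(n_fixes)])
# multilabel == [1, 0, 1]

# Label-powerset alternative: each subset of fixes gets its own class index.
powerset_index = sum(1 << i for i in applied)

# The powerset grows exponentially with the number of fixes.
n_powerset_classes = 2 ** n_fixes
```

With roughly 40 fixes the powerset has 2^40 (about 10^12) possible combinations, so only combinations that actually occur in the data could realistically get classes; the per-fix sigmoid encoding sidesteps that blow-up, which is one reason it is the more common framing for multi-label problems.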


Licensed under: CC-BY-SA with attribution