Question

I have recently been confronted with a kind of ML problem that is new to me: the output of the model should be a vector or matrix (depending on the interpretation; there is no real difference), not a scalar as usual. This is completely unfamiliar to me. What kind of approach should one apply here? Are the "usual" (scalar-based) models applicable to this problem?

(For the sake of completeness: the problem is an image segmentation task where the model should decide, first, whether a given pattern is present in the picture and, second, if so, where it is. In the latter case, it should delineate the borders of the corresponding subset of pixels.)


Solution

Neural networks can have a vector or a matrix as their output layer. Image segmentation is a well-researched topic, and deep learning is the state of the art for it (as for most image tasks). You will need (a lot of) training examples that record whether the pattern is present and, if so, where. The location label could be a bounding box, or a per-pixel label saying whether each pixel is part of the pattern (the latter yields a matrix of the same size as your input). To decide whether the pattern is present at all, you could build a second network that is just a binary classifier, or you could check whether your pixel-based network outputs almost only zeros when there is no pattern. In that case you will need negative examples (images without the pattern) as well.
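
To make the "almost only zeros" idea concrete, here is a minimal sketch in plain Python of how a per-pixel output matrix could be post-processed into a presence decision and a bounding box. The `probs` matrix stands in for the output of a hypothetical segmentation network, and the function names, the 0.5 threshold and the `min_pixels` cutoff are all assumptions made for illustration; they would need tuning on real data.

```python
from typing import List, Optional, Tuple


def binarize(probs: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-pixel probabilities into a 0/1 mask (threshold is an assumed value)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def pattern_present(mask: List[List[int]], min_pixels: int = 3) -> bool:
    """Treat an (almost) all-zero mask as 'no pattern'.

    min_pixels is a hypothetical tuning knob: it tolerates a few stray
    positive pixels before declaring that the pattern was found.
    """
    return sum(sum(row) for row in mask) >= min_pixels


def bounding_box(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right), inclusive, or None for an empty mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


# Made-up network output for a 4x5 image: one probability per input pixel.
probs = [
    [0.1, 0.0, 0.2, 0.1, 0.0],
    [0.0, 0.9, 0.8, 0.1, 0.0],
    [0.1, 0.7, 0.95, 0.6, 0.0],
    [0.0, 0.1, 0.2, 0.0, 0.1],
]

mask = binarize(probs)                       # same shape as the input image
found = pattern_present(mask)                # step 1: is the pattern there?
box = bounding_box(mask) if found else None  # step 2: if so, where?
print(found, box)
```

With this approach a single pixel-wise network answers both questions, at the cost of choosing the two thresholds by hand. A separate binary classifier avoids that tuning but means training and maintaining a second model.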

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange