Question

I am just getting started with my first simple digit classifier, so my questions are fairly basic. In every dataset of digit images I've seen so far, different variants of each digit are grouped together under a single label, for example:

[Images: three differently styled renderings of the digit 1]

All of these images represent the number 1, but they look fairly different. Won't a simple convolutional neural network have a hard time learning the visual pattern for 1 in such a case? Especially considering how closely the third image resembles a 7.

My questions are these:

  • Would it be better to create separate labels such as "1", "1-alt", "1-serif" etc.? The CNN could then sum the probabilities of the image being any variant of 1 before giving its prediction, but I'm not sure about this.

  • How do professional classifiers approach this problem?

  • Theoretically, will this method affect performance or accuracy in any way?


Solution

Interesting question. You are right in assuming that some 1s may be confused with 7s; the same goes for 8s and 3s, for instance.

In practice, people rarely create separate classes for each variant as you suggest, mainly because it would require additional annotation effort.

There are multiple ways to handle this.

  • stacked models

Anything the first model labels as a 1 or a 7 is passed on to a second model fine-tuned to distinguish between these two specific digits.
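A minimal sketch of that routing step, in plain Python. Everything here is hypothetical and only for illustration: `stacked_predict`, the two toy models and the 0.5 threshold are assumptions, not part of any particular library. The general model is assumed to return ten class probabilities, and the specialist is assumed to return the probability that the image is the first digit of the confusable pair.

```python
from typing import Callable, Sequence, Tuple

Image = Sequence[float]  # assumed: a flat vector of pixel values


def stacked_predict(
    image: Image,
    general_model: Callable[[Image], Sequence[float]],
    specialist_model: Callable[[Image], float],
    confusable: Tuple[int, int] = (1, 7),
) -> int:
    """Ask the general model first; hand 1-vs-7 cases to the specialist."""
    probs = general_model(image)
    # Index of the highest probability is the general model's guess.
    first_guess = max(range(len(probs)), key=probs.__getitem__)
    if first_guess not in confusable:
        return first_guess
    # The specialist returns P(image is confusable[0]).
    p_first = specialist_model(image)
    return confusable[0] if p_first >= 0.5 else confusable[1]


# Toy stand-ins so the sketch runs end to end (not real models).
def toy_general(image: Image) -> Sequence[float]:
    probs = [0.0] * 10
    probs[7] = 0.55  # the general model leans towards 7 ...
    probs[1] = 0.45  # ... but 1 is a close second
    return probs


def toy_specialist(image: Image) -> float:
    return 0.9  # the specialist is fairly sure this is a 1


prediction = stacked_predict([0.0] * 784, toy_general, toy_specialist)
print(prediction)
```

In a real setup the two callables would wrap trained networks, and the specialist would be trained only on images of the two confusable digits.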

  • weighted training

It is possible to teach your model that some mistakes are worse than others. In your case, mistaking a 1 for a 7 (or the reverse) is particularly bad, so you could increase the cost of making that specific mistake during training.
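One possible way to express "some mistakes cost more" is to add a penalty to the usual cross-entropy loss, proportional to the probability the model puts on the expensive wrong class. The sketch below is an assumption for illustration, not the only or the standard formulation: `cost_sensitive_loss` is a hypothetical helper, and the penalty value 4.0 is arbitrary.

```python
import math

NUM_CLASSES = 10

# cost[t][p]: extra penalty for probability placed on class p when the
# true class is t. All zeros except the 1 <-> 7 confusion (value assumed).
cost = [[0.0] * NUM_CLASSES for _ in range(NUM_CLASSES)]
cost[1][7] = 4.0
cost[7][1] = 4.0


def cost_sensitive_loss(probs, true_label, cost_matrix=cost):
    """Cross-entropy plus the expected extra cost of the predicted distribution."""
    cross_entropy = -math.log(max(probs[true_label], 1e-12))
    penalty = sum(c * p for c, p in zip(cost_matrix[true_label], probs))
    return cross_entropy + penalty


def two_class_probs(true_label, true_p, wrong_label):
    """Helper: put true_p on the right class and the rest on one wrong class."""
    probs = [0.0] * NUM_CLASSES
    probs[true_label] = true_p
    probs[wrong_label] = 1.0 - true_p
    return probs


# Same confidence in the true class (a 1), but the leftover mass goes
# either to 7 (penalised) or to 3 (not penalised).
loss_leaning_7 = cost_sensitive_loss(two_class_probs(1, 0.6, 7), true_label=1)
loss_leaning_3 = cost_sensitive_loss(two_class_probs(1, 0.6, 3), true_label=1)
print(loss_leaning_7 > loss_leaning_3)
```

With a loss like this, a model that hesitates between 1 and 7 is pushed harder during training than one that hesitates between 1 and 3. Deep learning frameworks typically offer per-class weights out of the box, whereas a per-pair cost like the one above usually has to be written as a custom loss.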

Licensed under: CC-BY-SA with attribution