Question

I'm trying to develop a new weight initialization method, but I'm getting a weird training phenomenon: output node 8 never has the maximum activation.

I'm using MATLAB's patternnet with tansig activation, mse performance, and no bias nodes. I'm trying to classify a subset of the MNIST database.

Does anyone have any ideas on how to troubleshoot this? With Nguyen-Widrow initialization I don't see this behavior, even though the architecture is the same.

Edit:

Inputs: 768xN of values between 0 and 1

Targets: 10xN of values 0 or 1, one row per class. So it's like a logical matrix with exactly one true entry per column.

One or more nodes do not activate; I showed the best case.

This occurs with one or more layers (1 to 5) and with less or more training data (1k to 10k samples).
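One way to put a number on the symptom is to count how often each output node has the largest activation over a batch, and to list the nodes that never win. The sketch below is plain Python, not MATLAB, and the helper names `winner_counts` and `silent_nodes` are made up for illustration. It assumes the outputs are stored one sample per row (the transpose of the 10xN layout above) and uses 0-based node indices.

```python
from collections import Counter


def winner_counts(outputs, num_classes):
    """Count how often each output node has the largest activation.

    `outputs` is a list of per-sample activation lists, one value per
    output node. Ties go to the lowest node index.
    """
    counts = Counter(
        max(range(num_classes), key=lambda k: row[k]) for row in outputs
    )
    return [counts.get(k, 0) for k in range(num_classes)]


def silent_nodes(outputs, num_classes):
    """Return the (0-based) indices of output nodes that never win."""
    counts = winner_counts(outputs, num_classes)
    return [k for k, c in enumerate(counts) if c == 0]


# Toy data with 3 output nodes: node 2 is never the largest activation.
toy_outputs = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.5],
    [0.7, 0.6, 0.1],
]
print(winner_counts(toy_outputs, 3))
print(silent_nodes(toy_outputs, 3))
```

Running the same count on the network outputs right after initialization and again after training would show whether a node is already silent at the start or only becomes silent during training.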


Solution

I think I found a solution to the problem.

By scaling the weights so that they stay within the significant domain of the transfer function (-1 to 1), I no longer saw this phenomenon.
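The answer does not say exactly how the scaling was done, so the following is only one possible reading, sketched in plain Python rather than MATLAB. It assumes inputs in [0, 1] and no bias, as in the question, and rescales each neuron's incoming weights so that its net input cannot leave [-1, 1]. The function name `scale_to_unit_net_input` is hypothetical, and `math.tanh` stands in for tansig, which is mathematically equivalent to tanh. A plausible reason this helps is that large net inputs push tanh into its flat region, where gradients are close to zero.

```python
import math
import random


def scale_to_unit_net_input(weights):
    """Rescale one neuron's weights so its net input stays in [-1, 1].

    For inputs in [0, 1], the largest possible net input is the sum of
    the positive weights and the smallest is the sum of the negative
    ones, so dividing by the larger magnitude bounds the net input.
    """
    pos = sum(w for w in weights if w > 0)
    neg = -sum(w for w in weights if w < 0)
    worst = max(pos, neg)
    if worst <= 1.0:
        return list(weights)  # already inside the target range
    return [w / worst for w in weights]


random.seed(0)
n_inputs = 768  # input size stated in the question
raw = [random.uniform(-1.0, 1.0) for _ in range(n_inputs)]
scaled = scale_to_unit_net_input(raw)

# One random input vector with values in [0, 1].
x = [random.random() for _ in range(n_inputs)]
net_raw = sum(w * xi for w, xi in zip(raw, x))
net_scaled = sum(w * xi for w, xi in zip(scaled, x))

# Derivative of tanh is 1 - tanh(n)^2; it shrinks toward 0 as |n| grows.
print("raw net input:", net_raw, "gradient factor:", 1 - math.tanh(net_raw) ** 2)
print("scaled net input:", net_scaled, "gradient factor:", 1 - math.tanh(net_scaled) ** 2)
```

This worst-case bound is conservative and can make the initial weights very small for wide layers. A looser variant, such as scaling by the expected rather than the worst-case net input, may be closer to what the answer actually did.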

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow