Question

I want to build a model that predicts the future response of an input signal. The architecture of my network is [3, 5, 1]:

  • 3 inputs,
  • 5 neurons in the hidden layer, and
  • 1 neuron in output layer.

My questions are:

  1. Should there be a separate bias for the hidden layer and for the output layer?
  2. Should we assign a weight to the bias at each layer (the bias adds an extra value to the network and might overburden it)?
  3. Why is the bias always set to one? If eta can take different values, why don't we set the bias to different values?
  4. Why do we always use the log-sigmoid function as the nonlinearity? Can we use tanh?

Solution

So, I think it'd clear most of this up if we were to step back and discuss the role the bias unit is meant to play in a NN.

A bias unit is meant to allow units in your net to learn an appropriate threshold (i.e. after reaching a certain total input, start sending positive activation), since normally a positive total input means a positive activation.

For example, if your bias unit has a weight of -2 with some neuron x, then neuron x will provide a positive activation only if all other input adds up to more than 2, because the bias contributes -2 to the total input.
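The effect of a bias weight of -2 can be checked numerically. The following is only a minimal sketch, assuming a single unit that counts as active when its total input is positive; the function names `total_input` and `fires`, the unit weights, and the sample inputs are all illustrative and not taken from the question.

```python
def total_input(inputs, weights, bias_weight, bias_input=1.0):
    """Weighted sum of the inputs plus the bias contribution."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias_weight * bias_input


def fires(inputs, weights, bias_weight):
    """True when the unit's total input is positive."""
    return total_input(inputs, weights, bias_weight) > 0


weights = [1.0, 1.0, 1.0]  # hypothetical weights for three inputs
bias_weight = -2.0         # the bias weight used in the example in the text

# The other input sums to 1.5, so the total is 1.5 - 2 = -0.5.
below = fires([0.5, 0.5, 0.5], weights, bias_weight)
# The other input sums to 2.5, so the total is 2.5 - 2 = 0.5.
above = fires([1.0, 1.0, 0.5], weights, bias_weight)

print(below, above)
```

With these values the unit stays inactive in the first case and becomes active in the second, so the bias weight has moved the point at which the unit switches on.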

So, with that as background, your answers:

  1. No, one bias input is always sufficient, since it can affect different neurons differently depending on its weight with each unit.
  2. Generally speaking, having bias weights going to every non-input unit is a good idea, since otherwise those units without bias weights would have thresholds that will always be zero.
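  3. Because the threshold, once learned, should be consistent across trials. Remember that the bias weight represents how each unit responds to its input; the bias isn't an input signal itself. Any fixed nonzero value would work, since the learned bias weight rescales it, so 1 is simply the conventional choice.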
  4. You certainly can, and many do. Generally, any squashing function works as an activation function.
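To make the four answers concrete, here is a rough sketch of a forward pass through a [3, 5, 1] network. It assumes one shared bias input fixed at 1, a separate bias weight for every hidden and output unit, and a swappable activation function. The helper names (`make_layer`, `forward_layer`, `forward`), the random initial weights, and the sample input are assumptions for illustration, and no training is shown.

```python
import math
import random


def logistic(z):
    """Log-sigmoid squashing function with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_out, rng):
    """One weight row per unit; the last entry of each row is the bias weight."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward_layer(layer, inputs, activation):
    """Append the constant bias input (1.0), then apply each unit's weights."""
    extended = list(inputs) + [1.0]
    return [activation(sum(w * x for w, x in zip(row, extended)))
            for row in layer]


def forward(network, inputs, activation=math.tanh):
    """Propagate the inputs through every layer in turn."""
    values = inputs
    for layer in network:
        values = forward_layer(layer, values, activation)
    return values


rng = random.Random(0)  # fixed seed so the sketch is repeatable
network = [make_layer(3, 5, rng), make_layer(5, 1, rng)]  # [3, 5, 1]

sample = [0.2, -0.1, 0.4]  # hypothetical input values
out_tanh = forward(network, sample, math.tanh)
out_logistic = forward(network, sample, logistic)
print(out_tanh, out_logistic)
```

The same constant input of 1 feeds every non-input unit, and only the per-unit bias weights differ. Swapping `math.tanh` for `logistic` changes the output range from (-1, 1) to (0, 1) and leaves the rest of the code untouched.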
Licensed under: CC-BY-SA with attribution