Resolved it. Apparently lower-dimensional networks are more likely to get stuck in a local minimum. This makes sense once you consider that higher-dimensional networks are less likely to settle into any minimum at all, even the global one.
Implementing momentum that increases with each iteration gets me through most of the local minima. On top of that, re-initializing the weights to random values in (-0.5, 0.5) and running multiple training sessions eventually gets me through all of them.
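The combination can be sketched on a toy 1-D loss that has both a local and a global minimum. The loss function, learning rate, iteration counts, and momentum schedule below are illustrative assumptions, not the actual network; only the two ideas (momentum ramped up per iteration, plus random restarts in (-0.5, 0.5)) come from what I did:

```python
import random

def loss(w):
    # Toy loss with a local minimum near w ~ 1.13 and a global one near w ~ -1.30.
    return w**4 - 3*w**2 + w

def grad(w):
    return 4*w**3 - 6*w + 1

def train(w0, iters=500, lr=0.01, m_start=0.5, m_max=0.95):
    """Gradient descent whose momentum grows linearly over the run."""
    w, v = w0, 0.0
    for t in range(iters):
        # Momentum increases with each iteration, helping roll through shallow minima.
        m = m_start + (m_max - m_start) * t / iters
        v = m * v - lr * grad(w)
        w += v
    return w

def train_with_restarts(restarts=20):
    """Repeat training from fresh random weights; keep the best result."""
    best_w, best_loss = None, float("inf")
    for _ in range(restarts):
        w0 = random.uniform(-0.5, 0.5)  # re-initialize in (-0.5, 0.5)
        w = train(w0)
        if loss(w) < best_loss:
            best_w, best_loss = w, loss(w)
    return best_w, best_loss
```

A single run can still land in the local minimum depending on the starting point; the restart loop is what makes reaching the global one reliable in practice.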
I am happy to report that my network now completes training in 100% of cases, as long as the data is classifiable.