Question

I've recently implemented my own neural network (using various guides, but mainly the one from here), for future use in an OCR program I'll be developing. I'm currently testing it, and I've run into a strange problem.

Whenever I give my network a training example, the algorithm changes the weights in a way that leads to the desired output. However, after a few training examples the weights get messed up: the network works well for some outputs and produces wrong results for others (even when I feed it the inputs from the training examples, exactly as they were given).

I would appreciate it if someone could point me to the problem, should they spot it. Here are the methods that calculate each neuron's error and adjust the weights:

    private static void UpdateOutputLayerDelta(NeuralNetwork Network, List<double> ExpectedOutputs)
    {
        for (int i = 0; i < Network.OutputLayer.Neurons.Count; i++)
        {
            double NeuronOutput = Network.OutputLayer.Neurons[i].Output;
            Network.OutputLayer.Neurons[i].ErrorFactor = ExpectedOutputs[i] - NeuronOutput; //calculating the error factor
            Network.OutputLayer.Neurons[i].Delta = NeuronOutput * (1 - NeuronOutput) * Network.OutputLayer.Neurons[i].ErrorFactor; //calculating the neuron's delta
        }
    }

    //step 3 method
    private static void UpdateNetworkDelta(NeuralNetwork Network)
    {
        NeuronLayer UpperLayer = Network.OutputLayer;
        for (int i = Network.HiddenLayers.Count - 1; i >= 0; i--)
        {
            foreach (Neuron LowerLayerNeuron in Network.HiddenLayers[i].Neurons)
            {
                for (int j = 0; j < UpperLayer.Neurons.Count; j++)
                {
                    Neuron UpperLayerNeuron = UpperLayer.Neurons[j];
                    LowerLayerNeuron.ErrorFactor += UpperLayerNeuron.Delta * UpperLayerNeuron.Weights[j + 1]/*+1 because of bias*/;
                }
                LowerLayerNeuron.Delta = LowerLayerNeuron.Output * (1 - LowerLayerNeuron.Output) * LowerLayerNeuron.ErrorFactor;
            }
            UpperLayer = Network.HiddenLayers[i];
        }
    }

    //step 4 method
    private static void AdjustWeights(NeuralNetwork Network, List<double> NetworkInputs)
    {
        //Adjusting the weights of the hidden layers
        List<double> LowerLayerOutputs = new List<double>(NetworkInputs);
        for (int i = 0; i < Network.HiddenLayers.Count; i++)
        {
            foreach (Neuron UpperLayerNeuron in Network.HiddenLayers[i].Neurons)
            {
                UpperLayerNeuron.Weights[0] += -LearningRate * UpperLayerNeuron.Delta;
                for (int j = 1; j < UpperLayerNeuron.Weights.Count; j++)
                    UpperLayerNeuron.Weights[j] += -LearningRate * UpperLayerNeuron.Delta * LowerLayerOutputs[j - 1] /*-1 because of bias*/;
            }
            LowerLayerOutputs = Network.HiddenLayers[i].GetLayerOutputs();
        }

        //Adjusting the weight of the output layer
        foreach (Neuron OutputNeuron in Network.OutputLayer.Neurons)
        {
            OutputNeuron.Weights[0] += -LearningRate * OutputNeuron.Delta * 1; //updating the bias - TODO: change this if the bias is also changed throughout the program
            for (int j = 1; j < OutputNeuron.Weights.Count; j++)
                OutputNeuron.Weights[j] += -LearningRate * OutputNeuron.Delta * LowerLayerOutputs[j - 1];
        }
    }

The learning rate is 0.5, and the neurons' activation function is a sigmoid function.
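
For reference, here is a minimal sketch of the standard logistic function and of the derivative that the Output * (1 - Output) factor above comes from (the method names are illustrative, not taken from my code):

    // Standard logistic sigmoid: maps any input into the range (0, 1).
    private static double Sigmoid(double x)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }

    // The sigmoid's derivative, expressed in terms of its output y = Sigmoid(x);
    // this is exactly the Output * (1 - Output) factor in the delta calculations.
    private static double SigmoidDerivative(double y)
    {
        return y * (1.0 - y);
    }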

EDIT: I've noticed I never implemented the function that calculates the overall error, E = 0.5 * Sum((t - y)^2), for each training example. Could that be the problem? And if so, how should I fix it?
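
What I have in mind would look roughly like this, reusing the same types as the methods above (it would only be for monitoring convergence, not for changing the weight updates):

    // Sum-of-squares error for a single training example:
    // E = 0.5 * Sum over all output neurons of (t - y)^2
    private static double ComputeExampleError(NeuralNetwork Network, List<double> ExpectedOutputs)
    {
        double Error = 0.0;
        for (int i = 0; i < Network.OutputLayer.Neurons.Count; i++)
        {
            double Diff = ExpectedOutputs[i] - Network.OutputLayer.Neurons[i].Output;
            Error += Diff * Diff;
        }
        return 0.5 * Error;
    }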

Solution

A learning rate of 0.5 seems a bit too large; values closer to 0.01 or 0.1 are more typical. It also usually helps convergence if the training patterns are presented in random order. More useful hints can be found here: Neural Network FAQ (comp.ai.neural-nets archive).
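
As a rough sketch of both suggestions, assuming a TrainingExample holder for your patterns and a TrainOnExample method wrapping your existing forward/backpropagation pass (both names are placeholders, not from your code):

    // Hypothetical container for one training pattern.
    private class TrainingExample
    {
        public List<double> Inputs;
        public List<double> ExpectedOutputs;
    }

    private const double LearningRate = 0.05; // much smaller than 0.5

    private static void TrainEpoch(NeuralNetwork Network, List<TrainingExample> Examples, Random Rng)
    {
        // Present the patterns in a fresh random order each epoch (Fisher-Yates shuffle).
        for (int i = Examples.Count - 1; i > 0; i--)
        {
            int j = Rng.Next(i + 1);
            TrainingExample Temp = Examples[i];
            Examples[i] = Examples[j];
            Examples[j] = Temp;
        }

        foreach (TrainingExample Example in Examples)
            TrainOnExample(Network, Example.Inputs, Example.ExpectedOutputs); // your existing per-example training step
    }

The shuffle touches each element exactly once and makes every ordering equally likely, so no single pattern systematically gets the "last word" on the weights.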
