Backpropagation algorithm (Matlab): output values are saturating to 1

https://stackoverflow.com/questions/14982358

10-03-2022

Question

I have coded up a backpropagation algorithm in Matlab based on these notes: http://dl.dropbox.com/u/7412214/BackPropagation.pdf

My network takes input/feature vectors of length 43, has 20 nodes in the hidden layer (an arbitrary parameter choice I can change), and has a single output node. I want to train my network to take the 43 features and output a single value between 0 and 100. The input data was normalized to zero mean and unit standard deviation (via z = (x - mean) / std), and then I appended a "1" term to the input vectors to represent a bias. My targetValues are just single numbers between 0 and 100.
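
A minimal sketch of the normalization and bias step described above. The name `rawFeatures` and the toy data are assumptions for illustration and do not come from the original code; only `inputFeatures` matches the name used in the training loop. Note the parentheses: the mean has to be subtracted before dividing by the standard deviation.

```matlab
% Hypothetical raw data: 5 samples, 3 features (the real data has 43 features)
rawFeatures = [1 10 100; 2 20 200; 3 30 300; 4 40 400; 5 50 500];

mu    = mean(rawFeatures, 1);    % per-feature mean (row vector)
sigma = std(rawFeatures, 0, 1);  % per-feature standard deviation
sigma(sigma == 0) = 1;           % guard against constant features

% z = (x - mean) / std, applied column by column
zFeatures = bsxfun(@rdivide, bsxfun(@minus, rawFeatures, mu), sigma);

% Append a constant 1 to every sample to act as the bias input
inputFeatures = [zFeatures, ones(size(zFeatures, 1), 1)];
```

With 43 real features, `inputFeatures` would have 44 columns, so `numInputNeurons` would presumably be 44.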

Here are the relevant parts of my code:

(By my convention, layer I (i) refers to the input layer, J (j) refers to the hidden layer, and K (k) refers to the output layer, which is a single node in this case.)

    for train=1:numItrs
        for iterator=1:numTrainingSets

            %%%%%%%% FORWARD PROPAGATION %%%%%%%%

            % Grab the inputs, which are rows of the inputFeatures matrix
            InputLayer = inputFeatures(iterator, :)'; %don't forget to turn into column 
            % Calculate the hidden layer outputs: 
            HiddenLayer = sigmoidVector(WeightMatrixIJ' * InputLayer); 
            % Now the output layer outputs:
            OutputLayer = sigmoidVector(WeightMatrixJK' * HiddenLayer);

            %%%%%%% Debug stuff %%%%%%%% (for single valued output)
            if (mod(train+iterator, 100) == 0)
               str = strcat('Output value: ', num2str(OutputLayer), ' | Test value: ', num2str(targetValues(iterator, :)')); 
               disp(str);
            end 




            %%%%%%%% BACKWARDS PROPAGATION %%%%%%%%

            % Propagate backwards for the hidden-output weights
            currentTargets = targetValues(iterator, :)'; %strip off the row, make it a column for easy subtraction
            OutputDelta = (OutputLayer - currentTargets) .* OutputLayer .* (1 - OutputLayer); 
            EnergyWeightDwJK = HiddenLayer * OutputDelta'; %outer product
            % Update this layer's weight matrix:
            WeightMatrixJK = WeightMatrixJK - epsilon*EnergyWeightDwJK; %does it element by element

            % Propagate backwards for the input-hidden weights
            HiddenDelta = HiddenLayer .* (1 - HiddenLayer) .* WeightMatrixJK*OutputDelta; 
            EnergyWeightDwIJ = InputLayer * HiddenDelta'; 
            WeightMatrixIJ = WeightMatrixIJ - epsilon*EnergyWeightDwIJ; 

        end

    end

And the weight matrices are initialized as follows:

WeightMatrixIJ = rand(numInputNeurons, numHiddenNeurons) - 0.5; 
WeightMatrixJK = rand(numHiddenNeurons, numOutputNeurons) - 0.5; 
%randoms b/w (-0.5, 0.5)

The "sigmoidVector" function takes every element in a vector and applies y = 1 / (1 + exp(-x)).
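
The body of `sigmoidVector` is not shown in the question. A minimal stand-in, assuming it is nothing more than the elementwise logistic function described above, could be an anonymous function:

```matlab
% Assumed implementation: elementwise logistic function y = 1 / (1 + exp(-x))
sigmoidVector = @(x) 1 ./ (1 + exp(-x));

% Quick check on a few inputs: every output lies strictly between 0 and 1
y = sigmoidVector([-5; 0; 5]);
disp(y');
```

Whatever the input, the result stays inside (0, 1), which matters for the target range used during training.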

Here's what the debug messages look like, from the start of the code:

Output value:0.99939 | Test value:20
Output value:0.99976 | Test value:20
Output value:0.99985 | Test value:20
Output value:0.99989 | Test value:55
Output value:0.99991 | Test value:65
Output value:0.99993 | Test value:62
Output value:0.99994 | Test value:20
Output value:0.99995 | Test value:20
Output value:0.99995 | Test value:20
Output value:0.99996 | Test value:20
Output value:0.99996 | Test value:20
Output value:0.99997 | Test value:92
Output value:0.99997 | Test value:20
Output value:0.99997 | Test value:20
Output value:0.99997 | Test value:20
Output value:0.99997 | Test value:20
Output value:0.99998 | Test value:20
Output value:0.99998 | Test value:20
Output value:0.99999 | Test value:20
Output value:0.99999 | Test value:20
Output value:1 | Test value:20
Output value:1 | Test value:62
Output value:1 | Test value:70
Output value:1 | Test value:77
Output value:1 | Test value:20
** stays saturated at 1 **

Obviously, I'd like the network's output values to fall between 0 and 100 so that they can match those target values!

Thank you for any help. If you need more information, I'll provide all I can.

Solution

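The sigmoid function is limited to the range (0,1), so the output can never reach your target values, which are all greater than 1. Because the error (OutputLayer - currentTargets) is therefore always negative, every update pushes the output higher until it saturates at 1. You should scale your target values so they are also in the range of the sigmoid. Since you know your target values are constrained to the range (0,100), just divide them all by 100, and multiply the network's output by 100 to get back to the original units.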
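
A rough sketch of the scaling, assuming the targets really are bounded by 100. The names `targetScale`, `scaledTargets`, `rawDelta`, `scaledDelta`, and `prediction` are hypothetical, and the numbers are made up for illustration. It also compares the sign of the output delta from the question's update rule with and without scaling:

```matlab
% Hypothetical targets in the original 0..100 units
targetValues = [20; 55; 65; 62; 92];

targetScale   = 100;                         % assumed known upper bound of the targets
scaledTargets = targetValues / targetScale;  % now in [0, 1], reachable by a sigmoid

% Example sigmoid output (made up), compared against the first target
OutputLayer = 0.9;

% Same delta formula as in the question's code
rawDelta    = (OutputLayer - targetValues(1))  * OutputLayer * (1 - OutputLayer);
scaledDelta = (OutputLayer - scaledTargets(1)) * OutputLayer * (1 - OutputLayer);

% rawDelta is negative for any output in (0, 1): the update can only push the output up.
% scaledDelta is positive here because 0.9 overshoots 0.2, so the update pulls it down.

% After training on scaledTargets, map an output back to the original units
prediction = OutputLayer * targetScale;
```

In the training loop, this would amount to reading `currentTargets` from the scaled targets instead of `targetValues`, and multiplying `OutputLayer` by 100 only when reporting predictions.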

Licensed under: CC-BY-SA with attribution
