Question

I am trying to train an artificial neural network with backpropagation. I have a feedforward network with 6 input neurons, 7 hidden neurons, and 1 output neuron. I feed it a feature vector made up of 6 features and train it; my learning rate is 0.7 and my momentum is 0.9. I want to classify into 2 classes based on my 6 features. The problem is that the overall error of this network doesn't change. I have tried different values for the learning rate and momentum, but the problem remained the same, and I don't understand why it is doing this. I have tried the same code (I mean the main classes) to train an ANN on the XOR problem, and there it worked perfectly. Does anyone have any idea why this is happening? Thank you for your time :)

FeedforwardNetwork network = new FeedforwardNetwork();
Train train;

network.AddLayer(new FeedforwardLayer(6));
network.AddLayer(new FeedforwardLayer(7));
network.AddLayer(new FeedforwardLayer(1));

train = new Backpropagation(network, Input_vector, Ideal_vector, 0.7, 0.8);

int epoch = 1;
textBox7.Text = " It has begun\r\n";
do
{
    train.Iteration();
    textBox7.Text += "\r\n Epoch " + epoch + " Error " + train.Error + " \r\n ";
    epoch++;
}
while ((epoch < 500) && (train.Error > 0.001));

network = train.Network;

textBox7.Text += "Neural Network Results";

for (int i = 0; i < Ideal_vector.Length; i++)
{
    double[] actual = network.ComputeOutputs(Input_vector[i]);

    // Note: the separating commas were missing between features 3, 4 and 5.
    textBox7.Text += "\r\n" + Input_vector[i][0] + "," + Input_vector[i][1] + "," +
        Input_vector[i][2] + "," + Input_vector[i][3] + "," + Input_vector[i][4] + "," +
        Input_vector[i][5] + " actual= " +
        actual[0] + ", ideal " + Ideal_vector[i][0] + " \r\n";
}

Solution 3

Your main function is fine. However, either your training vectors or your backpropagation code is not (assuming your network is big enough to learn this). So this is going to be a bunch of questions instead of an answer, but they may point you toward the right idea:

  • How many samples does your training vector include?
  • Are those samples roughly classified half/half or is there a bias?
  • Are there identical training samples that are classified ambiguously?
  • How is the error calculated? Abs/Sqr average?
  • Do you randomize the initial network weights?
  • What is the initial error before training?
  • Does the error change in the first iteration?
  • Can you post the code on pastebin?
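Two of the checks above (random initial weights, whether the error moves at all in the first iteration) can be sanity-checked outside the library. Below is a minimal Python sketch, not the poster's C# code: the 6-7-1 layout is taken from the question, but the sigmoid activation, absence of biases and momentum, and the learning rate of 0.1 are assumptions. It runs one plain backprop step on a single sample and checks that the squared error actually drops; if it doesn't even drop here, the update code is suspect.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical 6-7-1 net mirroring the question's layout.
# Weights are randomized -- with all-zero weights, hidden units
# stay symmetric and learning stalls.
n_in, n_hid = 6, 7
w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
w2 = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    y = sigmoid(sum(w * hi for w, hi in zip(w2, h)))
    return h, y

sample = [random.random() for _ in range(n_in)]
target = 1.0

h, y = forward(sample)
err_before = (target - y) ** 2

# One plain gradient-descent step, no momentum.
lr = 0.1
delta_out = (y - target) * y * (1 - y)           # output delta (sigmoid)
for j in range(n_hid):
    delta_h = delta_out * w2[j] * h[j] * (1 - h[j])  # hidden delta, old w2
    w2[j] -= lr * delta_out * h[j]
    for i in range(n_in):
        w1[j][i] -= lr * delta_h * sample[i]

_, y = forward(sample)
err_after = (target - y) ** 2
print(err_before > err_after)  # the error should move after one step
```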

OTHER TIPS

Are you using batch learning or online learning? If the answer is batch, then maybe your learning rate is too high. You can try scaling it down by dividing it by the number of training patterns. As @Marcom said, if you have too few neurons your network has too little capacity. That's a bit rough to explain, but basically you aren't using the non-linear region of the neurons, and your network is biased.
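The scaling mentioned above can be illustrated on a toy problem. In batch mode the gradient is summed over all patterns, so the step size grows with the training-set size; dividing the rate by the number of patterns turns the sum into an average. This is a sketch on a made-up one-parameter linear fit, not the poster's network, reusing the question's learning rate of 0.7 as an assumption:

```python
# Toy batch gradient descent: fit y = w * x to data generated with w = 2.
data = [(x / 10.0, 2.0 * x / 10.0) for x in range(10)]

w = 0.0
lr = 0.7
for _ in range(100):
    # The gradient is SUMMED over the batch, so it scales with len(data)...
    grad = sum(2 * (w * x - y) * x for x, y in data)
    # ...hence the rate is divided by len(data) to take an average-sized step.
    w -= (lr / len(data)) * grad

print(round(w, 3))  # w converges close to the true slope 2.0
```

With the raw rate of 0.7 and no division, the same loop diverges on this data, which is exactly the "error never improves" symptom.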

Check here for a better explanation.

Try a huge number of neurons first; then you can decrease the number as long as the error keeps going down.

Try experimenting with adding an additional hidden layer, and also try increasing the number of hidden nodes. I can't give you a technical explanation off the top of my head, but if you have too few nodes the ANN might not be able to converge.

A loss function that does not evolve at the start of training an MLP usually means the network can't infer any rules to fit your training data (the gradient computed by your backprop can't make progress toward any meaningful local minimum). This can be caused by a lack of data for the problem you are trying to solve, or by too restricted an architecture.

Increasing your number of layers and/or their size should change that, although you will be prone to overfitting if your architecture is too complex. You will have to find a balance that fits your problem.

And don't hesitate to start with a low learning rate: setting it too high will cause your gradient to "bounce" and not converge.
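The "bouncing" effect is easy to reproduce on a one-dimensional quadratic, a toy illustration rather than the poster's setup: with gradient descent on f(w) = w², any step that more than doubles the distance correction overshoots the minimum, and the iterates grow instead of shrinking.

```python
def descend(lr, steps=50, w=1.0):
    """Gradient descent on f(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w -= lr * 2 * w
    return abs(w)

print(descend(0.1) < 1e-4)  # small rate: w shrinks toward the minimum
print(descend(1.1) > 1e3)   # rate too high: each update overshoots and diverges
```

Here each step multiplies w by (1 - 2*lr), so the iteration converges only while that factor has magnitude below 1, i.e. for rates between 0 and 1.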

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow