Training neural network for function approximation

https://stackoverflow.com/questions/10588862

08-06-2021

Question

I have absolutely no experience with neural networks, and for now I'm just playing with the FANN library to learn about them. The objective is to train a network to approximate the sine function. For that I'm using a 3-layer network with 1 input, 3 hidden, and 1 output neuron. The code is:

const unsigned int num_input = 1;
const unsigned int num_output = 1;
const unsigned int num_layers = 3;
const unsigned int num_neurons_hidden = 3;

struct fann *ann;

ann = fann_create_standard(num_layers, num_input, num_neurons_hidden, num_output);

fann_set_activation_steepness_hidden(ann, 1);
fann_set_activation_steepness_output(ann, 1);

fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
fann_set_activation_function_output(ann, FANN_SIGMOID_SYMMETRIC);

fann_set_train_stop_function(ann, FANN_STOPFUNC_BIT);
fann_set_bit_fail_limit(ann, 0.01f);

fann_set_training_algorithm(ann, FANN_TRAIN_RPROP);

fann_randomize_weights(ann, 0, 1);

for(int i=0; i<2; ++i) {
    for(float angle=0; angle<10; angle+=0.1) {
        float sin_angle = sinf(angle);
        fann_train(ann, &angle, &sin_angle);
    }
}

int k = 0;
for(float angle=0; angle<10; angle+=0.1) {
    float sin_angle = sinf(angle);
    float *o = fann_run(ann, &angle);
    printf("%d\t%f\t%f\t\n", k++, *o, sin_angle);
}

fann_destroy(ann);

However, I get results that have nothing to do with the real sine function. I suppose there is some fundamental error in my network design.

Solution

You chose the Resilient Backpropagation (Rprop) optimization algorithm in this line:

fann_set_training_algorithm(ann, FANN_TRAIN_RPROP);

Rprop is a batch update algorithm. This means you have to present the whole training set for each update. However, the documentation for fann_train says:

This training is always incremental training (see fann_train_enum), since only one pattern is presented.

So if you train with fann_train, the matching optimization option is FANN_TRAIN_INCREMENTAL. For batch learning with Rprop, you have to use one of these functions instead: fann_train_on_data, fann_train_on_file, or fann_train_epoch.
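To make the difference between the two update schemes concrete, here is a minimal sketch that does not use FANN at all. It fits a hypothetical one-weight model y = w * x with plain gradient descent (not Rprop), once updating after every pattern and once updating after the whole set. The function names, the model, and the learning rate are assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Incremental (online) training on the toy model y = w * x:
// the weight is changed immediately after each single pattern.
double train_incremental(const std::vector<double>& xs, const std::vector<double>& ys,
                         double w, double rate, int epochs) {
    assert(xs.size() == ys.size());
    for (int e = 0; e < epochs; ++e) {
        for (std::size_t i = 0; i < xs.size(); ++i) {
            double error = w * xs[i] - ys[i];
            w -= rate * error * xs[i];  // one update per pattern
        }
    }
    return w;
}

// Batch training on the same model: the gradient is accumulated over
// the whole training set and the weight is changed once per epoch.
double train_batch(const std::vector<double>& xs, const std::vector<double>& ys,
                   double w, double rate, int epochs) {
    assert(xs.size() == ys.size() && !xs.empty());
    for (int e = 0; e < epochs; ++e) {
        double gradient = 0.0;
        for (std::size_t i = 0; i < xs.size(); ++i) {
            gradient += (w * xs[i] - ys[i]) * xs[i];
        }
        w -= rate * gradient / static_cast<double>(xs.size());  // one update per epoch
    }
    return w;
}
```

Calling fann_train once per pattern corresponds to the first function. An algorithm such as Rprop needs the accumulated gradient of the second one, which is why it only makes sense with the epoch-based training functions.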

Here is what I noticed when I changed your code:

Your steepness is too high. I used the default value (0.5).
You have too few training epochs. I used about 20,000.
Your function is too complex for only 3 hidden neurons. Approximating it is not easy at all because it is periodic, so I changed the range over which I approximated the sine function to [0,3], which is much simpler.
The bit fail limit is too strict. :) I set it to 0.02f.
Rprop is not a very good training algorithm. FANN should implement something like Levenberg-Marquardt, which is much faster.

The solution I got is not perfect, but it is at least approximately correct (the columns are the index, the network output, and sinf(angle)):

0       0.060097        0.000000
1       0.119042        0.099833
2       0.188885        0.198669
3       0.269719        0.295520
4       0.360318        0.389418
5       0.457665        0.479426
6       0.556852        0.564642
7       0.651718        0.644218
8       0.736260        0.717356
9       0.806266        0.783327
10      0.860266        0.841471
11      0.899340        0.891207
12      0.926082        0.932039
...
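To show what the bit fail limit measures, here is a small sketch that counts how many of the rows printed above miss their target by more than a given limit. count_bit_fails is a hypothetical helper, not a FANN function, and it follows a simplified reading of the documented definition (an output counts as a failed bit when it differs from the desired output by more than the limit). FANN's own bookkeeping may differ in detail, for example in how it scales errors for symmetric activation functions, so the counts are illustrative only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of outputs whose absolute difference
// from the target is larger than the limit.
int count_bit_fails(const std::vector<double>& outputs,
                    const std::vector<double>& targets, double limit) {
    assert(outputs.size() == targets.size());
    int fails = 0;
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        if (std::fabs(outputs[i] - targets[i]) > limit) {
            ++fails;
        }
    }
    return fails;
}

// The 13 rows listed above: network output and sinf(angle).
const std::vector<double> sample_outputs = {
    0.060097, 0.119042, 0.188885, 0.269719, 0.360318, 0.457665, 0.556852,
    0.651718, 0.736260, 0.806266, 0.860266, 0.899340, 0.926082};
const std::vector<double> sample_targets = {
    0.000000, 0.099833, 0.198669, 0.295520, 0.389418, 0.479426, 0.564642,
    0.644218, 0.717356, 0.783327, 0.841471, 0.891207, 0.932039};
```

Under this simplified definition, count_bit_fails(sample_outputs, sample_targets, 0.02) gives 5 and the stricter limit 0.01 gives 8, which suggests why the run is more likely to end at the epoch limit than at zero failed bits.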

I used this modified code:

#include <cstdio>
#include <cmath>
#include <fann.h>
#include <floatfann.h>

int main()
{
  const unsigned int num_input = 1;
  const unsigned int num_output = 1;
  const unsigned int num_layers = 3;
  const unsigned int num_neurons_hidden = 2;

  const float angleRange = 3.0f;
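  const float angleStep = 0.1f;
  // Round instead of truncating so floating-point error cannot drop the last instance.
  int instances = (int)std::lround(angleRange / angleStep);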

  struct fann *ann;

  ann = fann_create_standard(num_layers, num_input, num_neurons_hidden, num_output);

  fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
  fann_set_activation_function_output(ann, FANN_SIGMOID_SYMMETRIC);

  fann_set_train_stop_function(ann, FANN_STOPFUNC_BIT);
  fann_set_bit_fail_limit(ann, 0.02f);

  fann_set_training_algorithm(ann, FANN_TRAIN_INCREMENTAL);

  fann_randomize_weights(ann, 0, 1);

  fann_train_data *trainingSet;
  trainingSet = fann_create_train(instances, 1, 1); // instances, input dimension, output dimension
  float angle=0;
  for(int instance=0; instance < instances; angle+=angleStep, instance++) {
      trainingSet->input[instance][0] = angle;
      trainingSet->output[instance][0] = sinf(angle);
  }

  // max epochs, epochs between reports, desired error
  // (with FANN_STOPFUNC_BIT the last value is compared against the bit fail count)
  fann_train_on_data(ann, trainingSet, 20000, 10, 1e-8f);

  int k = 0;
  angle=0;
  for(int instance=0; instance < instances; angle+=angleStep, instance++) {
      float sin_angle = sinf(angle);
      float *o = fann_run(ann, &angle);
      printf("%d\t%f\t%f\t\n", k++, *o, sin_angle);
  }

  fann_destroy_train(trainingSet); // free the training data as well
  fann_destroy(ann);

  return 0;
}

Note that fann_create_train is available since FANN 2.2.0.
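If fann_create_train is not available in your version, one possible workaround is to write the same samples to a training data file and load it with fann_train_on_file. The sketch below only builds the file contents in FANN's plain-text training format as I understand it: a header line with the number of pairs, inputs, and outputs, followed by alternating input and output lines. make_sine_training_text is a hypothetical helper name, not part of FANN.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper: builds the text of a training data file for sin(x)
// with one input and one output per pattern.
std::string make_sine_training_text(int instances, double step) {
    assert(instances > 0);
    std::ostringstream out;
    out << instances << " 1 1\n";       // header: pairs, inputs, outputs
    for (int i = 0; i < instances; ++i) {
        double angle = i * step;        // computed from the index to avoid accumulated error
        out << angle << "\n";           // input line
        out << std::sin(angle) << "\n"; // output line
    }
    return out.str();
}
```

You could write the returned string to a file, for example one named sine.data (the name is an assumption), and then train with fann_train_on_file(ann, "sine.data", 20000, 10, 1e-8f) in place of the fann_train_on_data call.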

Licensed under: CC-BY-SA with attribution
