Question

XOR cannot be solved by a single perceptron with the standard scalar product and unit step function.

This article suggests building a network out of 3 perceptrons: http://toritris.weebly.com/perceptron-5-xor-how--why-neurons-work-together.html

I'm trying to run the 3-perceptron network this way but it doesn't produce correct results for XOR:

//pseudocode
class perceptron {

  constructor(training_data) {
    this.training_data = training_data
  }

  train() {
    // iterate multiple times over the training data,
    // adjusting the weights with the perceptron learning rule,
    // and return the learned weights
  }

  unit_step(value) {
    if (value < 0) return 0
    else return 1
  }

  compute(input) {
    // input is an array such as [x1, x2]
    // note: this retrains from scratch on every call
    // (a bias/threshold weight is assumed to be handled inside train/scalar_product)
    weights = this.train()
    sum     = scalar_product(input, weights)
    return this.unit_step(sum)
  }
}

The above perceptron solves the NOT, AND, and OR bit operations correctly. This is how I use 3 perceptrons to solve XOR:

AND_perceptron = perceptron([
  {Input:[0,0],Output:0},
  {Input:[0,1],Output:0},
  {Input:[1,0],Output:0},
  {Input:[1,1],Output:1}
])

OR_perceptron = perceptron([
  {Input:[0,0],Output:0},
  {Input:[0,1],Output:1},
  {Input:[1,0],Output:1},
  {Input:[1,1],Output:1}
])

XOR_perceptron = perceptron([
  {Input:[0,0],Output:0},
  {Input:[0,1],Output:1},
  {Input:[1,0],Output:1},
  {Input:[1,1],Output:0}
])

test_x1 = 0
test_x2 = 1 

//first layer of perceptrons
and_result   = AND_perceptron.compute([test_x1, test_x2])
or_result    = OR_perceptron.compute([test_x1, test_x2])

//second layer
final_result = XOR_perceptron.compute([and_result, or_result])

The final_result above is not consistent: sometimes 0, sometimes 1. It seems I am wiring the 2 layers together wrongly. What is the correct way to run these 3 perceptrons in 2 layers?


Solution

You seem to be attempting to train your second layer's single perceptron to produce an XOR of its inputs. This isn't possible; a single perceptron can only learn to classify inputs that are linearly separable.
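To spell that out for your setup (your unit_step(v) returns 1 whenever v >= 0), a single perceptron computing step(w1*x1 + w2*x2 + b), where b is the bias/threshold weight your pseudocode presumably folds into the weight vector, would have to satisfy all four of the following at once:

b < 0                  (input 0,0 must give 0)
w2 + b >= 0            (input 0,1 must give 1)
w1 + b >= 0            (input 1,0 must give 1)
w1 + w2 + b < 0        (input 1,1 must give 0)

Adding the two middle lines gives w1 + w2 + 2b >= 0, i.e. w1 + w2 + b >= -b, and -b is positive by the first line, so w1 + w2 + b must be positive, which contradicts the last line. No choice of weights and bias works, and dropping the bias only makes things worse: the 0,0 case alone already fails because step(0) is 1.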

The usual solution to the XOR problem with perceptrons is a two-layer network trained with the back propagation algorithm, so that each hidden-layer node learns to classify one of the two linearly-separable regions of the output, and the final output layer combines those results either additively or multiplicatively (depending on which precise regions the hidden nodes have learned). I'm really not entirely sure what the "and" and "or" perceptrons in your pseudocode are supposed to achieve, but they aren't part of the usual solution.
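For a concrete picture of that usual approach, here is a minimal TypeScript sketch of a 2-2-1 network trained with back propagation. It is not your perceptron class: it assumes sigmoid activations, a fixed learning rate of 0.5 and random initial weights, because back propagation needs a differentiable activation rather than the unit step.

// sketch: 2-2-1 network trained with back propagation (sigmoid units, bias folded in as a weight)
const sigmoid = (v: number) => 1 / (1 + Math.exp(-v));
const rand = () => Math.random() * 2 - 1;   // small random initial weight

// hidden[i] = [weight for x1, weight for x2, bias weight] of hidden node i
const hidden = [[rand(), rand(), rand()], [rand(), rand(), rand()]];
// output    = [weight for h1, weight for h2, bias weight] of the single output node
const output = [rand(), rand(), rand()];

function forward(x1: number, x2: number) {
  const h = hidden.map(w => sigmoid(w[0] * x1 + w[1] * x2 + w[2]));
  const o = sigmoid(output[0] * h[0] + output[1] * h[1] + output[2]);
  return { h, o };
}

const data = [
  { input: [0, 0], target: 0 },
  { input: [0, 1], target: 1 },
  { input: [1, 0], target: 1 },
  { input: [1, 1], target: 0 },
];

const rate = 0.5;
for (let epoch = 0; epoch < 20000; epoch++) {
  for (const { input: [x1, x2], target } of data) {
    const { h, o } = forward(x1, x2);

    // error term of the output node (the sigmoid derivative is o * (1 - o))
    const deltaO = (target - o) * o * (1 - o);
    // error terms of the hidden nodes, back-propagated through the output weights
    const deltaH = h.map((hi, i) => deltaO * output[i] * hi * (1 - hi));

    // weight update: rate * error term * the input that fed the weight
    output[0] += rate * deltaO * h[0];
    output[1] += rate * deltaO * h[1];
    output[2] += rate * deltaO;          // bias input is always 1
    for (let i = 0; i < 2; i++) {
      hidden[i][0] += rate * deltaH[i] * x1;
      hidden[i][1] += rate * deltaH[i] * x2;
      hidden[i][2] += rate * deltaH[i];  // bias input is always 1
    }
  }
}

// after training, the rounded outputs should reproduce the XOR truth table
for (const { input: [x1, x2] } of data) {
  console.log(x1, x2, Math.round(forward(x1, x2).o));
}

With a network this small, training can occasionally get stuck in a poor local minimum; re-running with fresh random weights usually sorts that out.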

Licensed under: CC-BY-SA with attribution