Question

I understand a perceptron can only work correctly on linearly separable sets, like the outputs of the NAND, AND, OR functions. I've been reading Wikipedia's entry on the perceptron, and got to play with its code.

XOR is a case where a single-layer perceptron should fail, as it is not a linearly separable set.

#xor
print ("xor")
t_s           = [((1, 1, 1), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 0)] 


threshold     = 0.5
learning_rate = 0.1
w             = [0, 0, 0]

def dot_product(values, weights):
    return sum(value * weight for value, weight in zip(values, weights))

def train_perceptron(threshold, learning_rate, weights, training_set):
    while True:
        #print('-' * 60)
        error_count = 0

        for input_vector, desired_output in training_set:
            #print(weights)
            result = dot_product(input_vector, weights) > threshold
            error  = desired_output - result

            if error != 0:
                error_count += 1
                for index, value in enumerate(input_vector):
                    weights[index] += learning_rate * error * value

        if error_count == 0: #iterate till there's no error 
            break
    return training_set

t_s = train_perceptron(threshold, learning_rate, w, t_s)

t_s = [(a[1:], b) for a, b in t_s]

for a, b in t_s:
    print "input: " + str(a) + ", output: " + str(b)

The output for this Ideone run is correct for XOR. How come?

xor
input: (1, 1), output: 0
input: (0, 1), output: 1
input: (1, 0), output: 1
input: (1, 1), output: 0

Solution

You pass t_s into train_perceptron, which returns it unmodified, and then you print it. Of course that looks perfect....

t_s = train_perceptron(threshold, learning_rate, w, t_s)

This does not change t_s at all. train_perceptron never modifies training_set; it just returns it: return training_set

Then here you output it:

t_s = [(a[1:], b) for a, b in t_s]

for a, b in t_s:
    print "input: " + str(a) + ", output: " + str(b)

Other tips

Try changing your training set so that it actually contains the missing (0, 0) case (keeping the leading 1, which acts as the bias input):

t_s = [((1, 1, 1), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 0, 0), 0)]
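
With the (0, 0) case included, the four examples form the full XOR truth table, which is not linearly separable, so the while True loop in train_perceptron can never reach error_count == 0 and the program will hang. A minimal sketch of a bounded variant that makes this visible (train_perceptron_bounded and max_epochs are names added here for illustration; dot_product is the question's function):

def train_perceptron_bounded(threshold, learning_rate, weights, training_set, max_epochs=1000):
    # same update rule as the question's train_perceptron,
    # but gives up after max_epochs instead of looping forever
    for epoch in range(max_epochs):
        error_count = 0
        for input_vector, desired_output in training_set:
            result = dot_product(input_vector, weights) > threshold
            error = desired_output - result
            if error != 0:
                error_count += 1
                for index, value in enumerate(input_vector):
                    weights[index] += learning_rate * error * value
        if error_count == 0:
            return epoch          # converged after this many epochs
    return None                   # no convergence within max_epochs

xor_set = [((1, 1, 1), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 0, 0), 0)]
print(train_perceptron_bounded(0.5, 0.1, [0, 0, 0], xor_set))  # expect None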

If my memory serves, to solve a problem that is not linearly separable with a perceptron you need at least one hidden layer with a non-linear activation for the neurons in that layer.
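
For XOR this can be done with two layers of threshold units, because XOR(a, b) = AND(OR(a, b), NAND(a, b)) and each of those three functions is linearly separable. A self-contained sketch with hand-picked (not learned) weights, using the same bias-as-first-input convention and 0.5 threshold as the question:

def unit(weights, inputs):
    # one threshold unit, same 0.5 threshold as the question's code
    return int(sum(w * x for w, x in zip(weights, inputs)) > 0.5)

w_or   = [0.0,  1.0,  1.0]   # fires if a or b is 1
w_nand = [1.0, -0.3, -0.3]   # fires unless both a and b are 1
w_and  = [-0.5, 1.0,  1.0]   # output layer: fires only if both hidden units fire

def xor(a, b):
    hidden = (unit(w_or, (1, a, b)), unit(w_nand, (1, a, b)))
    return unit(w_and, (1,) + hidden)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), xor(a, b))   # 0, 1, 1, 0

The hidden layer gives the network a second decision boundary, which is exactly what a single-layer perceptron lacks.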
