非結合関数近似のニューラルネットワークはどのように構造化されるべきですか？

https://datascience.stackexchange.com/questions/9495

16-10-2019
|

質問

多層パーセプトロンは、十分なニューロンを考えると、あらゆる関数を任意に正確に近似できると聞いたことがあります。試してみたかったので、次のコードを書きました。

#!/usr/bin/env python

"""Example for learning a regression."""


import tensorflow as tf
import numpy


def plot(xs, ys_truth, ys_pred):
    """
    Plot the true values and the predicted values.

    Parameters
    ----------
    xs : list
        Numeric values
    ys_truth : list
        Numeric values, same length as `xs`
    ys_pred : list
        Numeric values, same length as `xs`
    """
    import matplotlib.pyplot as plt
    truth_plot, = plt.plot(xs, ys_truth, '-o', color='#00ff00')
    pred_plot, = plt.plot(xs, ys_pred, '-o', color='#ff0000')
    plt.legend([truth_plot, pred_plot],
               ['Truth', 'Prediction'],
               loc='upper center')
    plt.savefig('plot.png')


# Parameters
learning_rate = 0.1
momentum = 0.6
training_epochs = 1000
display_step = 100

# Generate training data
train_X = []
train_Y = []

# First simple test: a linear function
f = lambda x: x+4

# Second, more complicated test: x^2
# f = lambda x: x**2

for x in range(-20, 20):
    train_X.append(float(x))
    train_Y.append(f(x))
train_X = numpy.asarray(train_X)
train_Y = numpy.asarray(train_Y)
n_samples = train_X.shape[0]

# Graph input
X = tf.placeholder(tf.float32)
reshaped_X = tf.reshape(X, [-1, 1])
Y = tf.placeholder("float")

# Create Model
W1 = tf.Variable(tf.truncated_normal([1, 100], stddev=0.1), name="weight")
b1 = tf.Variable(tf.constant(0.1, shape=[1, 100]), name="bias")
mul = tf.matmul(reshaped_X, W1)
h1 = tf.nn.sigmoid(mul) + b1
W2 = tf.Variable(tf.truncated_normal([100, 100], stddev=0.1), name="weight")
b2 = tf.Variable(tf.constant(0.1, shape=[100]), name="bias")
h2 = tf.nn.sigmoid(tf.matmul(h1, W2)) + b2
W3 = tf.Variable(tf.truncated_normal([100, 1], stddev=0.1), name="weight")
b3 = tf.Variable(tf.constant(0.1, shape=[1]), name="bias")
# identity as activation to get arbitrary output
activation = tf.matmul(h2, W3) + b3

# Minimize the squared errors
l2_loss = tf.reduce_sum(tf.pow(activation-Y, 2))/(2*n_samples)
optimizer = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(l2_loss)

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Display logs per epoch step
        if epoch % display_step == 0:
            cost = sess.run(l2_loss, feed_dict={X: train_X, Y: train_Y})
            print("cost=%s\nW1=%s" % (cost, sess.run(W1)))

    print("Optimization Finished!")
    print("cost=%s W1=%s" %
          (sess.run(l2_loss, feed_dict={X: train_X, Y: train_Y}),
           sess.run(W1)))  # "b2=", sess.run(b2)

    # Get output and plot it
    ys_pred = []
    ys_truth = []

    test_X = []
    for x in range(-40, 40):
        test_X.append(float(x))

    for x in test_X:
        ret = sess.run(activation, feed_dict={X: x})
        ys_pred.append(list(ret)[0][0])
        ys_truth.append(f(x))
    plot(train_X.tolist(), ys_truth, ys_pred)

この種は、線形関数に対して機能します（少なくともトレーニングデータでは、範囲外のテストデータではそれほどではありません）：

ただし、非線形関数$ x^2 $ではまったく機能しません。

なぜこのニューラルネットワークは、このような単純な関数近似で機能しないのですか？両方の機能で同じネットワークトポロジを機能させるために何を変更する必要がありますか？

解決

これはあなたの質問に直接答えることはありませんが、いくつかの役立つポインターが含まれるかもしれません。

最近画像認識のための深い残留学習書かれている論文：

複数の非線形層が漸近的に複雑な関数を漸近的に近似できるという仮説を立てている場合[...

しかし、この仮説はまだ未解決の問題です。見る [28].

ライセンス： CC-BY-SA と帰属

所属していません datascience.stackexchange