Theano/Lasagne/nolearn neural network image input

https://datascience.stackexchange.com/questions/5547

16-10-2019

Question

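I am working on an image classification task and decided to use Lasagne + nolearn for a neural network prototype. All the standard examples (such as MNIST digit classification) run fine, but problems appear when I try to use my own images.

I want to use 3-channel images rather than grayscale. Here is the code I use to get arrays from the images: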

img = Image.open(item)
img = ImageOps.fit(img, (256, 256), Image.ANTIALIAS)
img = np.asarray(img, dtype = 'float64') / 255.
img = img.transpose(2,0,1).reshape(3, 256, 256)
X.append(img)

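Here is the code for the NN and its fitting:

import numpy as np
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet

# simple_load is the asker's own loading helper (not shown in the post)
X, y = simple_load("new")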

X = np.array(X)
y = np.array(y)


net1 = NeuralNet(
    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(None, 65536),  # 96x96 input pixels per batch
    hidden_num_units=100,  # number of units in hidden layer
    output_nonlinearity=None,  # output layer uses identity function
    output_num_units=len(y),  # 30 target values

    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,

    regression=True,  # flag to indicate we're dealing with regression problem

    max_epochs=400,  # we want to train this many epochs
    verbose=1,
    )

net1.fit(X, y)

I get an exception like this:

Traceback (most recent call last):
  File "las_mnist.py", line 39, in <module>
    net1.fit(X[i], y[i])
  File "/usr/local/lib/python2.7/dist-packages/nolearn/lasagne.py", line 266, in fit
    self.train_loop(X, y)
  File "/usr/local/lib/python2.7/dist-packages/nolearn/lasagne.py", line 273, in train_loop
    X, y, self.eval_size)
  File "/usr/local/lib/python2.7/dist-packages/nolearn/lasagne.py", line 377, in train_test_split
    kf = KFold(y.shape[0], round(1. / eval_size))
IndexError: tuple index out of range

So, in which format do you "feed" image data to the network? Thanks for any answers or tips!

Solution 2

I also asked on the Lasagne users forum, and Oliver Duerr helped me a lot with a code example: https://groups.google.com/forum/#!topic/lasagne-users/8za7hr2wkfm
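The code from the linked thread is not reproduced here. As a minimal NumPy-only sketch of why the traceback in the question ends in an IndexError, assume (based on the traceback line `net1.fit(X[i], y[i])`) that a single label was passed to `fit`. Indexing a 1-D label array yields a 0-dimensional value whose `shape` is the empty tuple, so `y.shape[0]` fails. The array sizes and label values below are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in data: 4 RGB images (channel-first) and 4 integer labels.
X = np.zeros((4, 3, 256, 256), dtype=np.float32)
y = np.array([0, 1, 1, 0], dtype=np.int32)

print(y.shape)     # (4,)  -> y.shape[0] is the number of samples
print(y[0].shape)  # ()    -> a single label has no dimensions


def n_samples(labels):
    # Mirrors the expression `y.shape[0]` seen in the traceback.
    return labels.shape[0]


try:
    n_samples(y[0])
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)  # tuple index out of range

# Each image holds 3 * 256 * 256 values, not 256 * 256 = 65536.
values_per_image = int(np.prod(X.shape[1:]))
print(values_per_image)  # 196608
```

Under this reading, passing the full arrays (`net1.fit(X, y)`) would avoid the IndexError, and the declared input size would need to account for all three channels rather than the 65536 values of a single 256x256 channel.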

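Other tips

Out of curiosity: why do you want to use 3-channel images? I work in CV (computer vision) as well, and as far as I know grayscale images are the standard. Especially for MNIST, which has no color to begin with, there seems to be no benefit to using color.

I also think there is an intuitive reason to use grayscale: color is often a confounding variable. If you looked at a red "1" and a blue "1", you would say "Hey! Those are both ones! They are just different colors." A computer, however, is not as smart as you or I. It would say: "A 1 that is blue? I have never seen anything like that before!" Keep in mind how large the difference is between the colors (255, 0, 0) and (0, 255, 255), and how slight the differences between images can be (especially outside the comfortable playground of MNIST).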
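To make the size of that color gap concrete, here is a small NumPy sketch. The luma weights (0.299, 0.587, 0.114) are the common ITU-R BT.601 coefficients and are an assumption made here; the answer does not say which grayscale conversion it has in mind.

```python
import numpy as np

red = np.array([255, 0, 0], dtype=np.float64)
cyan = np.array([0, 255, 255], dtype=np.float64)

# Euclidean distance between the two colors in RGB space.
rgb_distance = np.linalg.norm(red - cyan)
# Largest possible distance in the 8-bit RGB cube (black to white).
max_distance = np.linalg.norm([255.0, 255.0, 255.0])

# Assumed BT.601 luma weights for converting RGB to one gray value.
luma = np.array([0.299, 0.587, 0.114])
gray_red = float(red @ luma)
gray_cyan = float(cyan @ luma)

print(rgb_distance, max_distance)
print(gray_red, gray_cyan)
```

The two colors sit at opposite corners of the RGB cube, so they are as far apart as two 8-bit colors can be, while after conversion each pixel is described by a single intensity value instead of three.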

In any case, most of the nolearn examples I have seen use a data shape like this:

X = X.reshape(-1, 1, size, size)

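If your images are grayscale, you can reshape them into this shape. Unfortunately, I am not sure how to get color data into nolearn and obtain the results you want.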
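As a sketch of that layout with plain NumPy: grayscale images fit the `(-1, 1, size, size)` shape directly. For color images, the analogous channel-first shape would be `(-1, 3, size, size)`, obtained by moving the channel axis rather than reshaping. Whether nolearn then trains as intended is not verified here. The random arrays stand in for real images, and `size = 8` is an arbitrary choice.

```python
import numpy as np

size = 8  # arbitrary image side length, for illustration only
rng = np.random.default_rng(0)

# Grayscale: 5 images stored as (N, H, W); add a single channel axis.
gray = rng.random((5, size, size)).astype(np.float32)
X_gray = gray.reshape(-1, 1, size, size)

# Color: 5 images stored as (N, H, W, C), i.e. each image laid out
# the way np.asarray() returns an RGB PIL image.
color = rng.random((5, size, size, 3)).astype(np.float32)
# Move the channel axis to position 1. A plain reshape to the same
# shape would mix pixel positions and channels together.
X_color = color.transpose(0, 3, 1, 2)

print(X_gray.shape)   # (5, 1, 8, 8)
print(X_color.shape)  # (5, 3, 8, 8)
```

With this layout, `X_color[n, c, i, j]` is channel `c` of pixel `(i, j)` in image `n`.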

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange