Question

I am currently coding a GAN to generate MNIST digits, but the generator doesn't want to work. First I sample z with shape 100 per batch element, feed it through a fully connected layer to get it into the shape (7, 7, 256), and then a conv2d_transpose layer is supposed to turn that into (28, 28, 1), which is basically an MNIST picture.
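In other words, the shape flow I am aiming for is:

z                : (batch, 100)
dense + reshape  : (batch, 7, 7, 256)
conv2d_transpose : (batch, 28, 28, 1)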

I have two questions:

1.) This code obviously doesn't work. Do you have any clue why?

2.) I am well aware of how transposed convolution works, but I can't find any resource that explains how to calculate the output size given the input size, strides and kernel size specifically for TensorFlow. The most useful information I found is https://arxiv.org/pdf/1603.07285v1.pdf, but padding in TensorFlow, for example, works quite differently. Can you help me?
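From that guide my best guess for TensorFlow is: with padding='SAME' the spatial output size is input * stride, and with padding='VALID' it is (input - 1) * stride + kernel, but I am not sure this is really how TensorFlow handles it. This is a small stand-alone probe (separate from the GAN code, shapes made up to match my case) that I use to see which output_shape tf.nn.conv2d_transpose accepts:

import numpy as np
import tensorflow as tf

# toy input: batch of 1, spatial size 7x7, 256 channels (like my reshaped G_h1Drop)
x = tf.placeholder(tf.float32, shape=[1, 7, 7, 256])
# filter layout for conv2d_transpose: [height, width, output_channels, input_channels]
w = tf.Variable(tf.truncated_normal([3, 3, 1, 256], stddev=0.03))

# with padding='SAME' and strides [2, 2] I expect 7 -> 14 (input * stride)
y = tf.nn.conv2d_transpose(x, w, output_shape=[1, 14, 14, 1],
                           strides=[1, 2, 2, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: np.zeros((1, 7, 7, 256), np.float32)}).shape)

Here is the relevant part of my actual GAN code: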

mb_size = 32 #Size of image batch to apply at each iteration.
X_dim = 784
z_dim = 100
h_dim = 7*7*256
dropoutRate = 0.7
alplr = 0.2 # slope of the leaky ReLU


def generator(z, G_W1, G_b1, keepProb, first_shape):

    G_W1 = tf.Variable(xavier_init([z_dim, h_dim]))
    G_b1 = tf.Variable(tf.zeros(shape=[h_dim]))    


    G_h1 = lrelu(tf.matmul(z, G_W1) + G_b1, alplr)
    G_h1Drop = tf.nn.dropout(G_h1, keepProb)  # drop out

    X = tf.reshape(G_h1Drop, shape=first_shape)
    out = create_new_trans_conv_layer(X, 256, INPUT_CHANNEL, [3, 3], [2,2], "transconv1", [-1, 28, 28, 1])    
    return out




# new transposed convolution layer
def create_new_trans_conv_layer(input_data, num_input_channels, num_output_channels, filter_shape, stripe, name, output_shape):
    # set up the filter shape for tf.nn.conv2d_transpose: [height, width, output_channels, input_channels]
    conv_filt_shape = [filter_shape[0], filter_shape[1], num_output_channels, num_input_channels]


    # initialise weights and bias for the filter
    weights = tf.Variable(tf.truncated_normal(conv_filt_shape, stddev=0.03),
                          name=name + '_W')
    bias = tf.Variable(tf.truncated_normal([num_input_channels]), name=name + '_b')

    # set up the transposed convolution operation
    conv1 = tf.nn.conv2d_transpose(input_data, weights, output_shape, [1, stripe[0], stripe[1], 1], padding='SAME')

    # add the bias
    conv1 += bias

    # apply a leaky ReLU non-linearity
    conv1 = lrelu(conv1, alplr)

    return conv1


...


    _, G_loss_curr = sess.run(
        [G_solver, G_loss],
        feed_dict={z: sample_z(mb_size, z_dim), keepProb: 1.0})  # generator training step
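The helpers lrelu, xavier_init and sample_z are omitted above; minimal stand-in versions that match how they are called would look like this (not necessarily exactly what I have in my script):

import numpy as np
import tensorflow as tf

def lrelu(x, alpha):
    # leaky ReLU with slope alpha on the negative side
    return tf.maximum(alpha * x, x)

def xavier_init(size):
    # Xavier-style initialisation for a dense weight matrix of shape `size`
    stddev = 1. / np.sqrt(size[0] / 2.)
    return tf.random_normal(shape=size, stddev=stddev)

def sample_z(m, n):
    # uniform noise in [-1, 1] as generator input
    return np.random.uniform(-1., 1., size=[m, n])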
