Question

I'm confused about what PyTorch's padding parameter does when using torch.nn.ConvTranspose2d. The docs say that:

"The padding argument effectively adds dilation * (kernel_size - 1) - padding amount of zero padding to both sizes of the input".

So my guess was that the dimensions of the feature maps increase when applying padding. However, running a test shows that they decrease:

inp = torch.ones((1, 1, 2, 2))
conv_no_pad = nn.ConvTranspose2d(1, 1, kernel_size=(3, 3), stride=2, padding=0)
conv_pad = nn.ConvTranspose2d(1, 1, kernel_size=(3, 3), stride=2, padding=1)
print(conv_no_pad(inp).shape)
# => (1, 1, 5, 5)
print(conv_pad(inp).shape)
# => (1, 1, 3, 3)

Can somebody explain how the padding works?
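For reference, the shapes above can be reproduced from the output-size formula in the ConvTranspose2d docs, H_out = (H_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1. A small helper (the function name is just illustrative) confirms both results:

```python
def convtranspose2d_out(h, kernel=3, stride=2, padding=0, dilation=1, output_padding=0):
    # Output-size formula from the ConvTranspose2d documentation
    return (h - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

print(convtranspose2d_out(2, padding=0))  # 5 -> matches (1, 1, 5, 5)
print(convtranspose2d_out(2, padding=1))  # 3 -> matches (1, 1, 3, 3)
```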


Solution

As you quoted, the padding argument effectively adds dilation * (kernel_size - 1) - padding zeros, so the padding value is subtracted and the resulting shape gets smaller. ConvTranspose2d is, in some sense, the reverse operation of Conv2d, which means its arguments work the opposite way. I think this behavior was introduced to make it easier to design networks with symmetric architectures (such as autoencoders): you simply copy the kernel_size, stride, and padding from the corresponding Conv2d layer and get an operation that restores the input's spatial shape.
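A minimal sketch of that symmetry (the channel counts and 5x5 input here are illustrative, not from the question):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 5, 5)
# Downsampling layer: 5x5 -> 3x3
conv = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)
# Transposed layer with the same kernel_size/stride/padding: 3x3 -> 5x5
deconv = nn.ConvTranspose2d(8, 3, kernel_size=3, stride=2, padding=1)
y = deconv(conv(x))
print(y.shape)  # torch.Size([1, 3, 5, 5])
```

Note that for even input sizes with stride 2 you would also need output_padding=1 to recover the exact shape, because Conv2d's floor division discards a row and column of information.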

Licensed under: CC-BY-SA with attribution