Question

I'm confused about what PyTorch's padding parameter does when using torch.nn.ConvTranspose2d. The docs say that:

"The padding argument effectively adds dilation * (kernel_size - 1) - padding amount of zero padding to both sizes of the input".

So my guess was that the dimensions of the feature maps increase when padding is applied. However, running a test shows that they decrease:

import torch
import torch.nn as nn

inp = torch.ones((1, 1, 2, 2))
conv_no_pad = nn.ConvTranspose2d(1, 1, kernel_size=(3, 3), stride=2, padding=0)
conv_pad = nn.ConvTranspose2d(1, 1, kernel_size=(3, 3), stride=2, padding=1)
print(conv_no_pad(inp).shape)
# => torch.Size([1, 1, 5, 5])
print(conv_pad(inp).shape)
# => torch.Size([1, 1, 3, 3])

Can somebody explain how the padding works?


Solution

As the documentation you quoted says, the padding argument effectively adds dilation * (kernel_size - 1) - padding zero padding, so the padding value is subtracted and the resulting feature map gets smaller, not larger. Concretely, the output size is (H_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1, which for your example gives (2 - 1) * 2 - 0 + 2 + 0 + 1 = 5 with padding=0 and (2 - 1) * 2 - 2 + 2 + 0 + 1 = 3 with padding=1.

ConvTranspose2d is (in some sense) the reverse operation of Conv2d, so its arguments work the opposite way. I think this behavior was introduced to make it easier to design networks with symmetric architectures (like autoencoders): you just copy the kernel_size, stride, and padding from the corresponding Conv2d layer and get an operation that maps the feature map back to the spatial size that layer received.
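Here is a minimal sketch of both points. The helper transposed_out_size and the 9x9 input are just for illustration (they are not from your question); the helper assumes dilation=1 and output_padding=0:

import torch
import torch.nn as nn

# Simplified ConvTranspose2d shape formula (dilation=1, output_padding=0):
# H_out = (H_in - 1) * stride - 2 * padding + kernel_size
def transposed_out_size(h_in, kernel_size, stride, padding):
    return (h_in - 1) * stride - 2 * padding + kernel_size

print(transposed_out_size(2, 3, 2, 0))  # 5 -> matches torch.Size([1, 1, 5, 5])
print(transposed_out_size(2, 3, 2, 1))  # 3 -> matches torch.Size([1, 1, 3, 3])

# Symmetric pair: copying kernel_size, stride and padding from a Conv2d
# to a ConvTranspose2d takes a 9x9 input down to 5x5 and back up to 9x9.
x = torch.ones((1, 1, 9, 9))
down = nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=1)
up = nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, padding=1)
print(down(x).shape)      # torch.Size([1, 1, 5, 5])
print(up(down(x)).shape)  # torch.Size([1, 1, 9, 9])

Note that with even input sizes the Conv2d flooring loses a pixel of information, so the round trip can come up one pixel short; that is what the output_padding argument of ConvTranspose2d is for.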

Licensed under: CC-BY-SA with attribution