Question

I am building a model based on ZFNet in TensorFlow 2.0, using the Petal images dataset. The images are of size 224x244x3. When implementing the first layer (conv2d) with filter size = 7, a stride of 3, and padding of 0, I get an output dimension of 109.5 using the formula (n + 2p - f)/S + 1. If I use the above-mentioned values, what dimension will TensorFlow return for the first layer? Secondly, how can I adjust the parameter values so that it returns a whole number?

reference formula : (n+2p-f)/2 +1

reference calculations: (224 + 0 - 7)/2 + 1 = 109.5

Thanks.

Solution

As per the formula for the feature map dimension:

$$feature_{dim} = \frac{n+2p-f}{S} + 1$$

The values are:

  • n = 224
  • p = 0
  • f = 7
  • S = 3

$$feature_{dim} = \frac{224 + 2 \cdot 0 - 7}{3} + 1 \approx 73.33$$

As you've guessed, a fractional value cannot be the size of the feature map.

TensorFlow rounds the division down, so with no padding (padding = "valid") the output dimension is 73.
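A minimal sketch of that rounding, written in plain Python so it runs without TensorFlow installed. The helper name `conv_output_size` is hypothetical, and the code assumes the floor behaviour for an unpadded convolution; it is not TensorFlow's own implementation.

```python
def conv_output_size(n: int, f: int, s: int, p: int = 0) -> int:
    """Output length along one axis: floor((n + 2p - f) / s) + 1.

    n: input size, f: filter size, s: stride, p: padding on each side.
    """
    if s <= 0:
        raise ValueError("stride must be positive")
    if n + 2 * p < f:
        raise ValueError("filter is larger than the padded input")
    # Integer division drops the fractional part that the plain formula keeps.
    return (n + 2 * p - f) // s + 1


print(conv_output_size(224, 7, 3))  # stride 3, as in the answer
print(conv_output_size(224, 7, 2))  # stride 2, as in the question's calculation
```

With the values from this thread, the first call gives 73 and the second gives 109 rather than 109.5.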

If you rely only on the formula, you miss the underlying concept: the kernel slides over the input in whole steps, so the feature map dimension has to be an integer. With a stride of 3, the kernel stops before it reaches the far edge and leaves the remaining pixels unused (here a single pixel, since 217 mod 3 = 1).
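The sliding can be made concrete by listing where each kernel window starts along one axis. This is only an illustrative sketch: `window_starts` is a hypothetical helper, and it assumes zero padding and 0-based pixel indices.

```python
def window_starts(n: int, f: int, s: int) -> list[int]:
    """Start index of every kernel position that fits fully inside the input."""
    # The last valid start is n - f, so the range stops at n - f + 1.
    return list(range(0, n - f + 1, s))


n, f, s = 224, 7, 3
starts = window_starts(n, f, s)

num_windows = len(starts)            # number of output positions
last_covered = starts[-1] + f - 1    # last pixel index touched by any window
unused = (n - 1) - last_covered      # pixels at the far edge never visited

print(num_windows, last_covered, unused)
```

For n = 224, f = 7, S = 3 this reports 73 windows, with the final window ending at index 222, so one pixel at the far edge is never covered.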

If you want the formula to give an integer feature map size while keeping the filter size fixed at 7, then your stride is:

$$S = 217 /(N - 1)$$

where N is your desired output size and 217 = 224 - 7.

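If you choose N to be 32 or 8, you'll end up with a stride of 7 or 31, respectively. Since 217 = 7 x 31, these are the only strides other than 1 and 217 that divide it evenly. It's better to choose S = 7, because the smaller stride retains more of the information. Still, a non-integer result does not cause an error: TensorFlow simply rounds the output size down.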
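As a rough check of which strides make the formula come out whole, the sketch below lists every stride that divides n + 2p - f exactly, together with the resulting output size. The function name `exact_strides` and its padding parameter `p` are assumptions made for illustration only.

```python
def exact_strides(n: int, f: int, p: int = 0) -> list[tuple[int, int]]:
    """Return (stride, output_size) pairs where (n + 2p - f) / stride is whole."""
    span = n + 2 * p - f
    if span <= 0:
        return []
    # A stride fits exactly when it divides the distance the kernel can travel.
    return [(s, span // s + 1) for s in range(1, span + 1) if span % s == 0]


for stride, size in exact_strides(224, 7):
    print(f"S = {stride:3d} -> N = {size}")
```

For n = 224 and f = 7 without padding, the candidates are S = 1, 7, 31, and 217, giving N = 218, 32, 8, and 2. Passing a non-zero `p` shows how padding changes the set of exact strides.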

Licensed under: CC-BY-SA with attribution