What exactly does it mean that a CNN is position equivariant?

https://datascience.stackexchange.com/questions/20404

convolution

30-10-2019

Question

There is quite a good explanation that fully agrees with my understanding, but it seems to lack one final step. As Jean states, moving an object significantly in the input image changes which neuron is activated in the yellow layer (the layer just before the first fully connected layer). So the part of the network before the fully connected layers (FCLs) is position equivariant. The author then says that, because the network detects an object at any location, the FCLs must have taken care of it.
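To make the equivariance claim concrete, here is a minimal NumPy sketch (it is not taken from the linked explanation). It assumes a single convolutional filter with wrap-around boundaries, so that a shift never pushes the object out of the image. The function name `circular_cross_correlate`, the 8x8 image, the random kernel and the shift are all made up for illustration. In real CNN layers with zero padding, stride or pooling, the property holds only approximately or only for certain shifts.

```python
import numpy as np

def circular_cross_correlate(image, kernel):
    """2D cross-correlation with periodic (wrap-around) boundaries.

    Hypothetical helper for illustration: out[i, j] is the sum over
    (di, dj) of kernel[di, dj] * image[(i + di) % H, (j + dj) % W].
    """
    out = np.zeros(image.shape, dtype=float)
    for di in range(kernel.shape[0]):
        for dj in range(kernel.shape[1]):
            # Rolling by (-di, -dj) moves pixel (i + di, j + dj) to (i, j).
            out += kernel[di, dj] * np.roll(image, shift=(-di, -dj), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))  # stand-in for a learned filter

# Toy image: a 2x2 "object" near the bottom-left corner.
image = np.zeros((8, 8))
image[5:7, 1:3] = 1.0

# Move the object 4 rows up and 4 columns to the right.
shift = (-4, 4)
shifted_image = np.roll(image, shift=shift, axis=(0, 1))

features = circular_cross_correlate(image, kernel)
shifted_features = circular_cross_correlate(shifted_image, kernel)

# Equivariance: shifting the input shifts the feature map by the same amount.
equivariant = np.allclose(
    shifted_features, np.roll(features, shift=shift, axis=(0, 1))
)
# Invariance would mean the feature map does not change at all.
invariant = np.allclose(shifted_features, features)

print("equivariant:", equivariant)
print("invariant:", invariant)
```

In this toy setting the feature map moves together with the object, which is what "a different neuron is activated" means; the map itself is not unchanged, so the layer is equivariant rather than invariant.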

Does the equivariance property hold for the whole network, including the trailing classifier? (I've read Difference between "equivariant to translation" and "invariant to translation", but I'm not sure I'm applying it to this case correctly.)
Should a network detect an object in the top-right corner if it was trained only on images with that object in the bottom-left corner?
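To show what I mean by the classifier part, here is a small NumPy sketch under strong simplifying assumptions. The 6x6 feature maps, the single activation per map and the random weights `dense_weights` are hypothetical and stand in for the output of the last convolutional layer and for one untrained fully connected unit. It does not answer either question; it only compares a position-discarding head (global max pooling) with a dense unit that has one weight per position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps from the last convolutional layer:
# one strong activation at the place where the object was detected.
features_bottom_left = np.zeros((6, 6))
features_bottom_left[5, 0] = 1.0  # object in the bottom-left corner

features_top_right = np.zeros((6, 6))
features_top_right[0, 5] = 1.0    # same object in the top-right corner

# Head A: global max pooling throws the position away.
pooled_bottom_left = features_bottom_left.max()
pooled_top_right = features_top_right.max()

# Head B: one dense unit on the flattened map, one weight per position.
dense_weights = rng.normal(size=features_bottom_left.size)
dense_bottom_left = dense_weights @ features_bottom_left.ravel()
dense_top_right = dense_weights @ features_top_right.ravel()

print("pooled:", pooled_bottom_left, pooled_top_right)
print("dense: ", dense_bottom_left, dense_top_right)
```

With pooling, both positions give the same value. The dense unit instead reads a different weight for each position (index 30 versus index 5 of the flattened map), so its output depends on where the object is unless training makes those weights agree.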

(I tried this demo, but it doesn't seem to address the second question.)

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange