CNN model contains several images that are null

https://datascience.stackexchange.com/questions/75640

11-12-2020

Question

I'm using a deep CNN with the ReLU activation function. When visualizing the layers (each with 32 filters), several of the filtered images are zeros.

I am trying to work out why this happens. What would cause an entire filtered image to be zero? I can understand a sparse image (given the ReLU), but what would cause all zeros?

I hope this is clear and has sufficient detail to answer appropriately.

Solution

In principle, it is to be expected that several feature maps are zero across the whole image. What a convolutional layer essentially does is move a stencil across the image and give a high value wherever the stencil's shape is found; everywhere else the response is low or negative, and the ReLU clips it to zero. Each feature map, i.e. each channel of a convolutional layer, corresponds to one such stencil. The lower layers encode simple shapes, the higher layers more complex ones.

If a channel outputs zero all over, it simply means that this feature was not detected in the image.

As an example, consider an MNIST classifier. One feature map may have learned to look for straight horizontal lines. When the model is presented with a hand-written "7", the straight horizontal lines will be detected in the top part of the image and maybe also in the middle. So the feature map will give high values in the top part and maybe in the middle as well, while it will be zero everywhere else.

If the model is presented with a hand-written "0", there are usually no straight horizontal lines, and therefore the feature map will be zero everywhere.
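To make the stencil picture concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the 3x3 horizontal-line kernel is hand-written rather than learned, and the two 5x5 images are toy stand-ins (one with a horizontal stroke like the top of a "7", one with only a full-height vertical stroke), not real MNIST digits.

```python
def conv2d_relu(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (valid cross-correlation), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = bias
            for di in range(kh):
                for dj in range(kw):
                    s += image[i + di][j + dj] * kernel[di][dj]
            row.append(max(0.0, s))  # ReLU: negative responses become zero
        out.append(row)
    return out


# Hypothetical hand-written stencil that responds to horizontal strokes.
HORIZONTAL_LINE = [
    [-1, -1, -1],
    [2, 2, 2],
    [-1, -1, -1],
]

# Toy input with a horizontal stroke in the second row.
with_bar = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

# Toy input with only a vertical stroke, so no horizontal structure at all.
vertical_only = [[0, 0, 1, 0, 0] for _ in range(5)]

map_bar = conv2d_relu(with_bar, HORIZONTAL_LINE)
map_vertical = conv2d_relu(vertical_only, HORIZONTAL_LINE)

print(map_bar)       # first row is [6.0, 6.0, 6.0], the other rows are zero
print(map_vertical)  # every entry is 0.0
```

The same filter gives a strong response on one input and an all-zero map on the other, which is the situation described above: the feature is simply absent from that particular image.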

So if a feature map is zero everywhere, it usually means that the particular feature is not found in the current input, but it might be present in other images.

If, however, a feature map is all zeros for all input images, i.e. the channel in the convolutional layer is zero everywhere for all 60k MNIST examples, then there might be something wrong with the network, and you might want to apply batch normalization or dropout.
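One way to tell the two cases apart is to record, for each channel, the largest activation seen over the whole dataset. The sketch below assumes you have already extracted the post-ReLU feature maps of one layer as nested lists (one entry per image, each a list of 2D channels). The function name `find_dead_channels` and the toy data are hypothetical; with a real framework you would collect the activations using its own tools.

```python
def find_dead_channels(batch_of_feature_maps, tol=0.0):
    """Return indices of channels whose activation never exceeds `tol`.

    batch_of_feature_maps: one entry per input image; each entry is a list
    of channels; each channel is a 2D list of post-ReLU activations.
    """
    num_channels = len(batch_of_feature_maps[0])
    max_per_channel = [0.0] * num_channels
    for feature_maps in batch_of_feature_maps:
        for c, channel in enumerate(feature_maps):
            channel_max = max(max(row) for row in channel)
            max_per_channel[c] = max(max_per_channel[c], channel_max)
    return [c for c, m in enumerate(max_per_channel) if m <= tol]


# Toy activations for two inputs, three channels each (2x2 maps).
image_a = [
    [[0.0, 0.0], [0.0, 0.0]],  # channel 0: silent on this input
    [[0.5, 0.0], [0.0, 0.0]],  # channel 1: fires
    [[0.0, 0.0], [0.0, 0.0]],  # channel 2: silent
]
image_b = [
    [[0.0, 1.2], [0.0, 0.0]],  # channel 0: fires on this input
    [[0.0, 0.0], [0.0, 0.0]],  # channel 1: silent
    [[0.0, 0.0], [0.0, 0.0]],  # channel 2: silent again
]

dead = find_dead_channels([image_a, image_b])
print(dead)  # [2]
```

Channels 0 and 1 are each zero on one of the inputs but fire on the other, so they are fine. Only channel 2 stays at zero for every input, and that is the case worth investigating.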

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange