Up to which layer can we consider the encoder to be?
13-12-2020
Question
I'm trying to extract the encoder from a U-Net network.
Given its architecture:
And its summary:
Model: "functional_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 200, 200, 1) 0
__________________________________________________________________________________________________
conv1_1 (Conv2D) (None, 200, 200, 64) 1664 input_1[0][0]
__________________________________________________________________________________________________
conv1_2 (Conv2D) (None, 200, 200, 64) 102464 conv1_1[0][0]
__________________________________________________________________________________________________
pool1 (MaxPooling2D) (None, 100, 100, 64) 0 conv1_2[0][0]
__________________________________________________________________________________________________
conv2_1 (Conv2D) (None, 100, 100, 96) 55392 pool1[0][0]
__________________________________________________________________________________________________
conv2_2 (Conv2D) (None, 100, 100, 96) 83040 conv2_1[0][0]
__________________________________________________________________________________________________
pool2 (MaxPooling2D) (None, 50, 50, 96) 0 conv2_2[0][0]
__________________________________________________________________________________________________
conv3_1 (Conv2D) (None, 50, 50, 128) 110720 pool2[0][0]
__________________________________________________________________________________________________
conv3_2 (Conv2D) (None, 50, 50, 128) 147584 conv3_1[0][0]
__________________________________________________________________________________________________
pool3 (MaxPooling2D) (None, 25, 25, 128) 0 conv3_2[0][0]
__________________________________________________________________________________________________
conv4_1 (Conv2D) (None, 25, 25, 256) 295168 pool3[0][0]
__________________________________________________________________________________________________
conv4_2 (Conv2D) (None, 25, 25, 256) 1048832 conv4_1[0][0]
__________________________________________________________________________________________________
pool4 (MaxPooling2D) (None, 12, 12, 256) 0 conv4_2[0][0]
__________________________________________________________________________________________________
conv5_1 (Conv2D) (None, 12, 12, 512) 1180160 pool4[0][0]
__________________________________________________________________________________________________
conv5_2 (Conv2D) (None, 12, 12, 512) 2359808 conv5_1[0][0]
__________________________________________________________________________________________________
up_conv5 (UpSampling2D) (None, 24, 24, 512) 0 conv5_2[0][0]
__________________________________________________________________________________________________
crop_conv4 (Cropping2D) (None, 24, 24, 256) 0 conv4_2[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 24, 24, 768) 0 up_conv5[0][0]
crop_conv4[0][0]
__________________________________________________________________________________________________
conv6_1 (Conv2D) (None, 24, 24, 256) 1769728 concatenate[0][0]
__________________________________________________________________________________________________
conv6_2 (Conv2D) (None, 24, 24, 256) 590080 conv6_1[0][0]
__________________________________________________________________________________________________
up_conv6 (UpSampling2D) (None, 48, 48, 256) 0 conv6_2[0][0]
__________________________________________________________________________________________________
crop_conv3 (Cropping2D) (None, 48, 48, 128) 0 conv3_2[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 48, 48, 384) 0 up_conv6[0][0]
crop_conv3[0][0]
__________________________________________________________________________________________________
conv7_1 (Conv2D) (None, 48, 48, 128) 442496 concatenate_1[0][0]
__________________________________________________________________________________________________
conv7_2 (Conv2D) (None, 48, 48, 128) 147584 conv7_1[0][0]
__________________________________________________________________________________________________
up_conv7 (UpSampling2D) (None, 96, 96, 128) 0 conv7_2[0][0]
__________________________________________________________________________________________________
crop_conv2 (Cropping2D) (None, 96, 96, 96) 0 conv2_2[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate) (None, 96, 96, 224) 0 up_conv7[0][0]
crop_conv2[0][0]
__________________________________________________________________________________________________
conv8_1 (Conv2D) (None, 96, 96, 96) 193632 concatenate_2[0][0]
__________________________________________________________________________________________________
conv8_2 (Conv2D) (None, 96, 96, 96) 83040 conv8_1[0][0]
__________________________________________________________________________________________________
up_conv8 (UpSampling2D) (None, 192, 192, 96) 0 conv8_2[0][0]
__________________________________________________________________________________________________
crop_conv1 (Cropping2D) (None, 192, 192, 64) 0 conv1_2[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate) (None, 192, 192, 160 0 up_conv8[0][0]
crop_conv1[0][0]
__________________________________________________________________________________________________
conv9_1 (Conv2D) (None, 192, 192, 64) 92224 concatenate_3[0][0]
__________________________________________________________________________________________________
conv9_2 (Conv2D) (None, 192, 192, 64) 36928 conv9_1[0][0]
__________________________________________________________________________________________________
conv9_3 (ZeroPadding2D) (None, 200, 200, 64) 0 conv9_2[0][0]
__________________________________________________________________________________________________
conv10_1 (Conv2D) (None, 200, 200, 1) 65 conv9_3[0][0]
==================================================================================================
Total params: 8,740,609
Trainable params: 8,740,609
Non-trainable params: 0
I think it goes up to conv5_2 (it's the last layer that has 512 channels), but I don't understand what the bottom part of the architecture does.
Up to which layer can we consider the encoder to be?
Solution
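An encoder maps data from a higher-dimensional space to a lower-dimensional one. In the summary, the spatial size shrinks from 200x200 at the input down to 12x12 at conv5_2, and up_conv5 is the first layer that increases it again. So I would say the encoder ends at conv5_2; after that, the upsampling starts.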
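If the goal is to actually pull the encoder out as its own model, one possible way in Keras is to build a new functional model that runs from the original input to the output of conv5_2. The sketch below is only an illustration: build_toy_unet and extract_encoder are hypothetical helper names, and the toy network is a much smaller stand-in (one pooling level, made-up filter counts) that only reuses some layer names from the summary. With the real model you would pass your own loaded model to extract_encoder.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_toy_unet(input_shape=(64, 64, 1)):
    """Hypothetical, shrunken stand-in for the U-Net in the question."""
    inputs = keras.Input(shape=input_shape, name="input_1")
    # Contracting path: convolutions followed by pooling.
    x = layers.Conv2D(8, 3, padding="same", activation="relu", name="conv1_1")(inputs)
    skip = layers.Conv2D(8, 3, padding="same", activation="relu", name="conv1_2")(x)
    x = layers.MaxPooling2D(2, name="pool1")(skip)
    # Bottom of the "U": lowest spatial resolution.
    x = layers.Conv2D(16, 3, padding="same", activation="relu", name="conv5_1")(x)
    bottom = layers.Conv2D(16, 3, padding="same", activation="relu", name="conv5_2")(x)
    # Expanding path: upsampling plus a skip connection from the encoder.
    x = layers.UpSampling2D(2, name="up_conv5")(bottom)
    x = layers.Concatenate(name="concatenate")([x, skip])
    x = layers.Conv2D(8, 3, padding="same", activation="relu", name="conv9_1")(x)
    outputs = layers.Conv2D(1, 1, name="conv10_1")(x)
    return keras.Model(inputs, outputs)


def extract_encoder(model, last_layer_name="conv5_2"):
    """Return a sub-model from the model input to the named layer's output."""
    return keras.Model(
        inputs=model.input,
        outputs=model.get_layer(last_layer_name).output,
        name="encoder",
    )


unet = build_toy_unet()
encoder = extract_encoder(unet)
encoder.summary()  # lists only the layers up to and including conv5_2
```

The extracted model reuses the same layer objects as the original, so it shares the trained weights rather than copying them. Note that in a U-Net the decoder also reads conv1_2 to conv4_2 through the crop layers, so if you need those intermediate features as well, you could pass a list of layer outputs instead of a single one.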
Licensed under: CC-BY-SA with attribution