Question

I am working on a CNN for X-ray image classification and I can't manage to train it properly. I am trying to implement the following paper in Keras: https://arxiv.org/pdf/1801.09927.pdf

In short, the paper describes a three-branch architecture called Attention Guided CNN (AG-CNN).

The Global Branch is a ResNet or DenseNet (I used a DenseNet) pretrained on ImageNet and then fine-tuned on the ChestX-ray14 dataset (the one CheXNet uses). After training, the activations of the last convolutional layer are turned into heatmaps, which are used to crop the images of the original dataset and build a new, cropped dataset.
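For reference, the crops are generated roughly like this: the activations of the last convolutional layer are collapsed into a single heatmap, the heatmap is thresholded, and the image is cropped to the bounding box of the region above the threshold. A simplified sketch (the paper keeps only the largest connected region above the threshold, which I omit here, and the threshold value is illustrative):

import numpy as np

def crop_from_activations(image, feature_maps, threshold=0.7):
    # feature_maps: (h, w, c) activations of the last convolutional layer
    heatmap = np.max(np.abs(feature_maps), axis=-1)                 # collapse channels
    heatmap = (heatmap - heatmap.min()) / (np.ptp(heatmap) + 1e-8)  # scale to [0, 1]

    ys, xs = np.where(heatmap > threshold)                          # attention region
    if len(ys) == 0:
        return image                                                # nothing above threshold

    # Map the bounding box from feature-map coordinates to image coordinates
    sy = image.shape[0] / float(heatmap.shape[0])
    sx = image.shape[1] / float(heatmap.shape[1])
    y0, y1 = int(ys.min() * sy), int((ys.max() + 1) * sy)
    x0, x1 = int(xs.min() * sx), int((xs.max() + 1) * sx)
    return image[y0:y1, x0:x1]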

The Local Branch is a ResNet or DenseNet (I used a DenseNet) with no pretraining; it is trained on the cropped dataset.

Finally, the Fusion Branch takes as inputs the outputs of the global-average-pooling layers of the Global and Local Branches.

I have managed to train the Global Branch (AUC: 0.77), generate the crops, and train the Local Branch (AUC: 0.67). But when I try to train the Fusion Branch, the val_loss does not decrease.

This is my code:

import os

from keras.models import Model, model_from_json
from keras.layers import Dense, Dropout, concatenate

def load_model_from_json(models_folder):
    # Rebuild the architecture from JSON, then load the trained weights
    print(" --- Reading model from", models_folder)
    json_path = os.path.join(models_folder, 'model.json')
    with open(json_path, 'r') as f:
        model = model_from_json(f.read())
    print("Read:", json_path)

    weights_path = os.path.join(models_folder, 'model_weights.h5')
    model.load_weights(weights_path)
    print("Read:", weights_path)

    return model

global_branch_model = load_model_from_json(self.global_branch_path)

local_branch_model = load_model_from_json(self.local_branch_path)

# Freeze both branches and prefix their layer names so the two DenseNets
# can be merged into one model without name collisions
for l in global_branch_model.layers:
    l.trainable = False
    l.name = 'global_' + l.name
for l in local_branch_model.layers:
    l.trainable = False
    l.name = 'local_' + l.name

# Fetch the global-average-pooling layer of each branch
# (layer names now carry the prefixes added above)
global_pooling = global_branch_model.get_layer('global_global_average_pooling2d_1')
local_pooling = local_branch_model.get_layer('local_global_average_pooling2d_1')

# Fusion head: concatenate the two pooled feature vectors and classify
merged = concatenate([global_pooling.output, local_pooling.output])
dense = Dense(512, activation='relu')(merged)
dropout = Dropout(self.hyperparameters.dropout)(dense)
out = Dense(1, activation='sigmoid')(dropout)

fusion_model = Model(inputs=[global_branch_model.input, local_branch_model.input],
                     outputs=out)

loss_function = unweighted_binary_crossentropy  # custom loss, defined elsewhere
optimizer = AdamW(lr=5e-5)                      # AdamW implementation, defined elsewhere

fusion_model.compile(optimizer=optimizer, loss=loss_function)
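# Quick sanity check that the freezing worked: only the two Dense layers of
# the fusion head should contribute trainable weights (kernel + bias each)
print(len(fusion_model.trainable_weights))  # expect 4 weight tensors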

fusion_model.fit_generator(
    generator=FusionDataGenSequence(self.labels, self.partition['train'],
                                    current_state='train',
                                    batch_size=self.batch_size,
                                    hyperparameters=self.hyperparameters,
                                    num_classes=self.number_of_classes),
    epochs=self.hyperparameters.epochs,
    verbose=1,
    callbacks=callbacks,
    workers=self.num_workers,
    # max_queue_size=32,
    # shuffle=False,
    validation_data=FusionDataGenSequence(self.labels, self.partition['valid'],
                                          current_state='validation',
                                          batch_size=self.batch_size,
                                          hyperparameters=self.hyperparameters,
                                          num_classes=self.number_of_classes)
    # validation_steps=1
)
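
For completeness, FusionDataGenSequence yields a two-element input list per batch, one array per branch, in the same order as the model's inputs. A simplified sketch of its structure (the two loader callables are placeholders; the real class also does the image preprocessing and takes the extra constructor arguments shown above):

from keras.utils import Sequence
import numpy as np

class FusionDataGenSequence(Sequence):
    # Simplified sketch: load_full / load_crop are placeholder callables that
    # return the full X-ray and its attention crop for a given sample id
    def __init__(self, labels, ids, batch_size, load_full, load_crop):
        self.labels = labels
        self.ids = ids
        self.batch_size = batch_size
        self.load_full = load_full
        self.load_crop = load_crop

    def __len__(self):
        return int(np.ceil(len(self.ids) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch = self.ids[idx * self.batch_size:(idx + 1) * self.batch_size]
        x_global = np.stack([self.load_full(i) for i in batch])  # full images
        x_local = np.stack([self.load_crop(i) for i in batch])   # cropped images
        y = np.array([self.labels[i] for i in batch])
        # The input order must match Model(inputs=[global, local], ...)
        return [x_global, x_local], y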

Can you please help me figure out what I did wrong? Thank you!
